Authors:
(1) Savvas Petridis, Google Research, New York, New York, USA;
(2) Ben Wedin, Google Research, Cambridge, Massachusetts, USA;
(3) James Wexler, Google Research, Cambridge, Massachusetts, USA;
(4) Aaron Donsbach, Google Research, Seattle, Washington, USA;
(5) Mahima Pushkarna, Google Research, Cambridge, Massachusetts, USA;
(6) Nitesh Goyal, Google Research, New York, New York, USA;
(7) Carrie J. Cai, Google Research, Mountain View, California, USA;
(8) Michael Terry, Google Research, Cambridge, Massachusetts, USA.
To understand how to support users in writing principles for chatbots, we conducted a one-hour formative workshop in which we observed eight industry professionals write principles for chatbots of their choice. All participants had prompting experience; two were designers and six were software engineers, all at a large technology company. During the workshop, participants used an early version of ConstitutionMaker, without principle elicitation features, and spent 25 minutes writing principles for their chatbot. Afterwards, we discussed the difficulties they faced while writing principles. Finally, we collected the principles they wrote and classified them to understand the kinds of principles they wanted to write.
In this section, we summarize three design goals for ConstitutionMaker that we established from the formative workshop and subsequent think-aloud sessions.
D.1 Help users recognize ways to improve the chatbot’s responses by showing alternative chatbot responses. Today’s LLMs are quite sophisticated, and even with just a preamble describing how the bot should behave, the chatbot can hold a convincing conversation. Because of this, participants mentioned that it was sometimes hard to imagine how the chatbot’s responses could be improved. This did not mean that they thought the chatbot’s responses were perfect; rather, the responses were passable and had no glaring errors. Therefore, to help participants recognize better kinds of responses toward which to steer the chatbot, our first design goal was to provide multiple candidate responses from the chatbot at each conversational turn. This way, participants can compare the candidates and recognize components they like more than others.
D.2 Help convert user feedback into specific principles to make principle writing easier. Participants reported that writing principles involves a difficult two-step process: (1) articulating one’s feedback on the model’s current output, and then (2) converting this feedback into a principle for the LLM to follow. Often, one’s initial reaction to the model’s output is intuitive, and converting that intuition into a principle for the chatbot to follow can be challenging. In addition, once participants had a particular bit of feedback in mind (e.g., “I don’t like how the chatbot didn’t introduce itself”), they were unsure how to phrase their principle. However, in line with prior research [45, 46], they found that more concrete principles that specified what should happen and when (e.g., “Introduce yourself at the start of the conversation, and state what you can help with”) generally led to better results. Thus, our second design goal was to help users go from their initial reaction to the model’s output to a specific, clearly written principle to steer the model.
D.3 Enable easier testing of principles to help users understand how well their principles are steering the chatbot’s behavior. As participants wrote more principles, they wanted ways to test these principles to make sure they worked. The early version of ConstitutionMaker only let users restart the conversation; it did not let them enable or disable individual principles. Users wanted to test individual principles on certain portions of the conversation to see whether the model was generating the correct content. Our last design goal was therefore to enable easier testing of principles.
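One way to picture this design goal is as a set of principles that can be switched on or off individually before a conversational turn is regenerated. The sketch below is a minimal illustration under stated assumptions, not ConstitutionMaker's implementation: the `Principle` class, its `enabled` flag, the `build_prompt` function, the example preamble, and the prompt layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Principle:
    """A single natural-language principle (hypothetical structure)."""
    text: str
    enabled: bool = True


def build_prompt(preamble: str, principles: list[Principle]) -> str:
    """Compose a prompt from the preamble and only the enabled principles."""
    active = [p.text for p in principles if p.enabled]
    if not active:
        return preamble
    rules = "\n".join(f"{i}. {text}" for i, text in enumerate(active, start=1))
    return f"{preamble}\n\nPrinciples:\n{rules}"


principles = [
    Principle("Introduce yourself at the start of the conversation, "
              "and state what you can help with"),
    Principle("Limit responses to 20 words"),
]

# Disable one principle to observe how the bot behaves without it.
principles[1].enabled = False

# The preamble text here is an illustrative assumption.
prompt = build_prompt("You are a travel agent chatbot.", principles)
print(prompt)
```

Regenerating the same turn with different principles enabled would let a user attribute a change in the response to a specific principle, which a full conversation restart does not allow.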
From the formative workshop and follow-up sessions, we collected 79 principles in total and classified them to understand the kinds of principles users wanted to write. These principles correspond to a number of very different chatbots, including a show recommender, chemistry tutor, role-playing game manager, travel agent, and more. We describe common types of principles below.
Principles can be either unconditional or conditional. Unconditional principles are those that apply at every conversational turn. Examples include: (1) those that define a consistent personality of the bot (e.g., “Act grumpy all the time” or “Speak informally and in the first person”), (2) those that place guardrails on the conversational content (e.g., “Don’t talk about anything but planning a vacation”), and (3) those that establish a consistent form for the bot’s responses (e.g., “Limit responses to 20 words”). Meanwhile, a conditional principle only applies when a certain condition is met. For example, “Generate an itinerary after all the information has been collected,” only applies to the conversation when some set of information has been acquired. Writing a conditional principle essentially defines a computational interaction; users establish a set of criteria that make the principle applicable to the conversation, and once that set of criteria is met, the principle is executed (e.g., an itinerary is generated).
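The idea that a conditional principle pairs a set of criteria with an action can be sketched as a predicate over the conversation. This is only an illustration: the `ConditionalPrinciple` class, the `REQUIRED_INFO` fields, and the keyword-matching check are assumptions made for the example. In practice the LLM itself judges whether the condition holds.

```python
from dataclasses import dataclass
from typing import Callable

# A conversation is modeled as a list of (speaker, utterance) pairs (assumption).
Conversation = list[tuple[str, str]]


@dataclass
class ConditionalPrinciple:
    """A principle plus the criteria that make it applicable (hypothetical)."""
    text: str
    condition: Callable[[Conversation], bool]


# Hypothetical pieces of information a travel bot might need first.
REQUIRED_INFO = ("destination", "dates", "budget")


def all_info_collected(conversation: Conversation) -> bool:
    """Naive keyword check over the user's turns, for illustration only."""
    user_text = " ".join(
        utterance.lower() for speaker, utterance in conversation if speaker == "user"
    )
    return all(field in user_text for field in REQUIRED_INFO)


itinerary = ConditionalPrinciple(
    text="Generate an itinerary after all the information has been collected",
    condition=all_info_collected,
)


def applicable(principle: ConditionalPrinciple, conversation: Conversation) -> bool:
    """Return True once the principle's criteria are met."""
    return principle.condition(conversation)


early = [("user", "I want to plan a trip.")]
later = early + [
    ("user", "My destination is Lisbon, dates are flexible, budget is modest.")
]
print(applicable(itinerary, early))  # False
print(applicable(itinerary, later))  # True
```

Under this framing, an unconditional principle is simply one whose condition always returns `True`.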
Conditional principles can depend on the entire conversation history, the user’s latest response, or the action the bot is about to take. For example, “Generate an itinerary after all the information has been collected” depends on the entire conversation history to determine if all of the requisite information has been collected. Similarly, the following principle written for a machine learning tutor, “After verifying a user’s goal, provide help to solve their problem,” depends on the conversation history to identify if the user’s goal has been verified. Meanwhile, the principle “When the user says they had a particular cuisine the night before, recommend a different cuisine,” written for a food recommender, pertains just to the latest response by the user. Finally, the condition can depend on the action the bot is about to take, like “When providing a list of suggestions, use free text rather than bullet points,” which applies to any situation in which the bot thinks it is appropriate to make suggestions.
Conditional principles can be fulfilled in a single conversational turn or over multiple turns. For example, the principle “At the start of the conversation, introduce yourself and ask a fun question to kick off the conversation” is fulfilled in a single conversational turn, in which the bot introduces itself. Similarly, “Before recommending a restaurant, ask the user for their location” is also fulfilled in a single turn. Meanwhile, for a role-playing game (RPG) bot that guides the user through an adventure, a participant wrote the following principle: “When the user tries to do something, put up small obstacles. Don’t let them succeed on the first attempt.” This principle implies that the bot needs to act over multiple turns before the principle is fulfilled (e.g., by first putting up a small obstacle and then subsequently letting the user succeed). Similarly, for a travel agent bot, a user wrote “Ask questions one-by-one to get an idea of their preferences,” which also requires multiple conversational turns before it is fulfilled.
In summary, principles can be either conditional, where they apply when a certain condition is met, or unconditional, where they apply at every conversational turn. Conditional principles further break down into those that depend on the entire conversation history, the user’s latest response, or the action the bot is about to take. Finally, conditional principles are fulfilled either in a single turn or over multiple conversational turns.
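The taxonomy above can be written down as a small data model. The following is a sketch with hypothetical names (`Scope`, `Fulfillment`, `ClassifiedPrinciple`), not a structure from ConstitutionMaker. The text labels each example along only one axis, so fields the text does not classify are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    """What a conditional principle's condition depends on."""
    CONVERSATION_HISTORY = "entire conversation history"
    LATEST_USER_RESPONSE = "user's latest response"
    BOT_ACTION = "action the bot is about to take"


class Fulfillment(Enum):
    """How many turns it takes to fulfill a conditional principle."""
    SINGLE_TURN = "single turn"
    MULTIPLE_TURNS = "multiple turns"


@dataclass(frozen=True)
class ClassifiedPrinciple:
    text: str
    conditional: bool
    scope: Optional[Scope] = None              # None: not labeled in the text
    fulfillment: Optional[Fulfillment] = None  # None: not labeled in the text


EXAMPLES = [
    ClassifiedPrinciple("Limit responses to 20 words", conditional=False),
    ClassifiedPrinciple(
        "Generate an itinerary after all the information has been collected",
        conditional=True, scope=Scope.CONVERSATION_HISTORY),
    ClassifiedPrinciple(
        "When the user says they had a particular cuisine the night before, "
        "recommend a different cuisine",
        conditional=True, scope=Scope.LATEST_USER_RESPONSE),
    ClassifiedPrinciple(
        "When providing a list of suggestions, use free text rather than bullet points",
        conditional=True, scope=Scope.BOT_ACTION),
    ClassifiedPrinciple(
        "Before recommending a restaurant, ask the user for their location",
        conditional=True, fulfillment=Fulfillment.SINGLE_TURN),
    ClassifiedPrinciple(
        "Ask questions one-by-one to get an idea of their preferences",
        conditional=True, fulfillment=Fulfillment.MULTIPLE_TURNS),
]

# Group the example principles by the scope of their condition.
by_scope = {
    scope: [p.text for p in EXAMPLES if p.scope is scope] for scope in Scope
}
for scope, texts in by_scope.items():
    print(f"{scope.value}: {texts}")
```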
This paper is available on arXiv under a CC 4.0 license.