Authors:
(1) Savvas Petridis, Google Research, New York, New York, USA;
(2) Ben Wedin, Google Research, Cambridge, Massachusetts, USA;
(3) James Wexler, Google Research, Cambridge, Massachusetts, USA;
(4) Aaron Donsbach, Google Research, Seattle, Washington, USA;
(5) Mahima Pushkarna, Google Research, Cambridge, Massachusetts, USA;
(6) Nitesh Goyal, Google Research, New York, New York, USA;
(7) Carrie J. Cai, Google Research, Mountain View, California, USA;
(8) Michael Terry, Google Research, Cambridge, Massachusetts, USA.
Inspired by our findings from the formative studies and workshop, we built ConstitutionMaker, an interactive web tool that supports users in converting their feedback into principles to steer a chatbot’s behavior. ConstitutionMaker enables users to define a chatbot, converse with it, and, within the conversation, interactively provide feedback to steer the chatbot’s behavior.
To illustrate how ConstitutionMaker works, let us consider a music enthusiast, Penelope, who would like to design a chatbot, called MusicBot, that helps users learn about and find new music. She starts by entering the name of her bot and roughly describing its purpose in the “Capabilities” section of the interface (Figure 1A). She then starts a conversation with MusicBot, and after MusicBot’s introductory message, she asks to learn about punk music (Figure 1B). Fulfilling our first design goal, help users recognize ways to improve the bot’s responses, ConstitutionMaker provides three candidate responses from the bot at each conversational turn (Figure 1D), which the user can compare and provide feedback on. Penelope peruses these candidates and likes the first one, as it invites the user to continue the conversation with a question at the end. She now wants to write a principle to help ensure that the chatbot continues to do this in future conversations.
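The paper does not specify the underlying prompting, but the per-turn candidate generation can be sketched as follows: the prompt combines the user-written capabilities description, the current principles, and the conversation so far, and the model is sampled several times to produce parallel candidates. This is a minimal illustration assuming a generic `complete(prompt)` text-completion call; the function names and prompt format are illustrative, not from the paper.

```python
def build_turn_prompt(bot_name, capabilities, principles, history):
    """Assemble the prompt for the chatbot's next turn from the bot
    definition, the principle list, and the conversation history."""
    lines = [f"You are {bot_name}. {capabilities}"]
    if principles:
        lines.append("Follow these principles:")
        lines.extend(f"- {p}" for p in principles)
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"{bot_name}:")  # cue the model to speak as the bot
    return "\n".join(lines)

def candidate_responses(complete, prompt, n=3):
    """Sample n candidate responses by calling the LLM n times."""
    return [complete(prompt) for _ in range(n)]
```

With a principle such as “End responses with a question,” each of the three sampled candidates is conditioned on the same prompt, so the user can compare how well each follows the current principles.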
Fulfilling D.2, help convert user feedback into principles, ConstitutionMaker provides three principle-elicitation features to support users in converting their feedback into principles: kudos, critique, and rewrite. Since Penelope likes the response, she selects kudos underneath it (Figure 1D), which reveals a menu with three automatically generated rationales on why the response is good, as well as a text field for Penelope to enter her own reason. After scanning the rationales, she selects the second, as it closely matches her own feedback, and a principle is then automatically generated from that rationale (Figure 1C). The critique (Figure 1F) and rewrite (Figure 1G) principle elicitation features work similarly: Penelope can select a negative rationale or rewrite the model’s response, respectively, to generate a principle. She then inspects the generated principle and, finding that it captures her intention well, leaves it unedited.
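The elicitation flow can be thought of as two LLM calls: one to propose short rationales for why a response is good (kudos) or bad (critique), and one to rewrite a selected rationale as a general, imperative principle. The prompt wording below is a hypothetical sketch, not the paper’s actual prompts:

```python
def rationale_prompt(response, kind):
    """Ask the LLM for three short rationales on why `response` is
    good (kind='kudos') or problematic (kind='critique')."""
    quality = "good" if kind == "kudos" else "problematic"
    return (
        f"List three short reasons why this chatbot response is {quality}:\n\n"
        f"{response}\n\nReasons:"
    )

def principle_prompt(rationale):
    """Ask the LLM to rewrite a selected rationale as a general rule
    the chatbot should follow in future turns."""
    return (
        "Rewrite this feedback as a general rule for the chatbot to follow:\n\n"
        f"Feedback: {rationale}\nRule:"
    )
```

For the rewrite feature, the second prompt would instead compare the original and user-rewritten responses and ask the model to infer the rule implied by the edit.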
Fulfilling D.3, enable easy testing of principles, she can then test whether the chatbot is following her principle by rewinding the conversation (Figure 1H) to get a new set of candidate responses from the model. Ultimately, she continues conversing with MusicBot, exploring different user journeys and using the principle elicitation features to create a comprehensive set of principles.
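Rewinding amounts to truncating the conversation history at an earlier turn and re-sampling candidates under the updated principle list. A minimal sketch, with `history` as a list of `(speaker, text)` pairs (the representation is an assumption, not from the paper):

```python
def rewind(history, turn_index):
    """Keep the conversation only up through `turn_index`, discarding
    later turns so fresh candidates can be sampled under the updated
    principles."""
    return history[: turn_index + 1]
```

After rewinding, the same prompt-and-sample step used for a normal turn runs again, now including the newly added principle, so the user can directly check whether the new candidates follow it.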
This paper is available on arxiv under CC 4.0 license.