How ConstitutionMaker Utilizes LLMs for Chatbot Behavior Crafting

Written by feedbackloop | Published 2024/01/24
Tech Story Tags: human-centric-ai | ai-research | llm-research | chatbot-design | constitutionmaker | constitutional-ai | ai-model-refinement | user-experience-in-ai-design

TLDR: Explore the technical aspects of ConstitutionMaker, an innovative chatbot customization tool. Learn how it uses a promptable LLM to build the dialogue prompt, surface the top-3 LLM completions, and shape future chatbot responses. Delve into the three key principle elicitation features (kudos, critique, and rewrite) and their role in shaping the dialogue prompt. ConstitutionMaker's implementation lets users craft principles dynamically, reshaping the landscape of interactive chatbot design.

Authors:

(1) Savvas Petridis, Google Research, New York, New York, USA;

(2) Ben Wedin, Google Research, Cambridge, Massachusetts, USA;

(3) James Wexler, Google Research, Cambridge, Massachusetts, USA;

(4) Aaron Donsbach, Google Research, Seattle, Washington, USA;

(5) Mahima Pushkarna, Google Research, Cambridge, Massachusetts, USA;

(6) Nitesh Goyal, Google Research, New York, New York, USA;

(7) Carrie J. Cai, Google Research, Mountain View, California, USA;

(8) Michael Terry, Google Research, Cambridge, Massachusetts, USA.

Table Of Links

Abstract & Introduction

Related Work

Formative Study

ConstitutionMaker

Implementation

User Study

Findings

Discussion

Conclusion and References

5 IMPLEMENTATION

ConstitutionMaker is a web application that utilizes an LLM [3] that is promptable in the same way as GPT-3 [4] or PaLM [5]. In the following subsections, we describe the implementation of ConstitutionMaker’s key features.

5.1 Facilitating the Conversation

To generate the chatbot’s response, ConstitutionMaker builds a dialogue prompt (Figure 3A) behind the scenes. The dialogue prompt consists of (1) a description of the bot’s capabilities, entered by the user (Figure 1A), (2) the current set of principles, and (3) the conversation history, ending with the user’s latest input. From this prompt, the LLM generates the bot’s next response; we take the top-3 completions it outputs and display them to users (Figure 3B). When the conversation is restarted or rewound, the conversation history within the dialogue prompt is modified: restarting deletes the entire history, whereas rewinding deletes everything after the rewind point. Finally, if the conversation gets too long for the prompt context window, we remove the oldest conversational turns until it fits.
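The paper does not include code for this step, but the prompt assembly and truncation logic can be sketched roughly as follows. The function names, the character-based length limit, and the `llm_complete` callable are all hypothetical assumptions, not the authors' implementation:

```python
# Minimal sketch of assembling a dialogue prompt like Figure 3A.
# MAX_PROMPT_CHARS, build_dialogue_prompt, and llm_complete are hypothetical.

MAX_PROMPT_CHARS = 8000  # stand-in for the model's context limit


def build_dialogue_prompt(bot_description, principles, history, user_input):
    """Combine the bot description, current principles, and conversation
    history into one prompt, dropping the oldest turns if it grows too long."""
    turns = history + [("User", user_input)]
    while True:
        principle_text = "\n".join(f"- {p}" for p in principles)
        dialogue = "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)
        prompt = (
            f"{bot_description}\n\n"
            f"Principles the bot should follow:\n{principle_text}\n\n"
            f"Conversation so far:\n{dialogue}\nBot:"
        )
        if len(prompt) <= MAX_PROMPT_CHARS or len(turns) <= 1:
            return prompt
        turns = turns[1:]  # remove the oldest conversational turn and retry


def next_bot_responses(llm_complete, prompt, n=3):
    """Return the top-n completions from the LLM to display as candidate replies."""
    return llm_complete(prompt, num_candidates=n)
```

In a real system the limit would likely be measured in tokens rather than characters, and the dialogue prompt would be rebuilt from scratch on every turn, which is what makes restarting and rewinding simple history edits.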

5.2 Three Principle Elicitation Features

All three principle elicitation features output a principle that is then incorporated back into the dialogue prompt (Figure 3A) to influence future conversational turns. Giving kudos and critiquing a bot’s response follow a similar process. For both, the selected bot output is fed into a few-shot prompt that generates rationales, either positive (Figure 3C) or negative (Figure 3D). The user’s selected rationale (or their own written rationale) is then sent to a few-shot prompt that converts this rationale into a principle (Figure 3F and 3G). This few-shot prompt leverages the conversation history to create a specific, conditional principle. For example, for MusicBot, if the critique is “The bot did not ask questions about the user’s preferences,” a specific, conditional principle might be “Prior to giving a music recommendation, ask the user what genres or artists they currently listen to.” Next, for critiques, after the principle is inserted into the dialogue prompt, new outputs are generated to show to the user (Figure 3G). Finally, for rewriting the bot’s response, we leverage a chain-of-thought [38] style prompt that first generates a “thought,” which reasons about how the original and rewritten outputs differ from each other, and then generates a specific principle based on that reasoning. Constructing the prompt with a “thought” portion led to principles that captured the difference between the two outputs better than our earlier versions without it.
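As a rough illustration of these two flows, the sketch below shows a rationale-to-principle few-shot prompt and a chain-of-thought rewrite prompt. The prompt wording, few-shot example layout, and helper names (`llm_complete`, `principle_from_rationale`, `principle_from_rewrite`) are assumptions for illustration; the paper's actual prompts are not reproduced in this section:

```python
# Hedged sketch of the critique/kudos-to-principle step and the
# chain-of-thought rewrite prompt. All prompt text below is illustrative.

RATIONALE_TO_PRINCIPLE = """\
Given the conversation and a critique (or kudos) about the bot's last
response, write a specific, conditional principle the bot should follow.

Critique: The bot did not ask questions about the user's preferences.
Principle: Prior to giving a music recommendation, ask the user what genres
or artists they currently listen to.

Conversation:
{conversation}
Critique: {rationale}
Principle:"""

REWRITE_TO_PRINCIPLE = """\
Compare the bot's original response with the user's rewritten response.
First write a Thought that reasons about how the two responses differ,
then write a specific principle based on that reasoning.

Original response: {original}
Rewritten response: {rewritten}
Thought:"""


def principle_from_rationale(llm_complete, conversation, rationale):
    """Turn a selected (or user-written) rationale into a conditional principle."""
    prompt = RATIONALE_TO_PRINCIPLE.format(conversation=conversation,
                                           rationale=rationale)
    return llm_complete(prompt, num_candidates=1)[0].strip()


def principle_from_rewrite(llm_complete, original, rewritten):
    """Generate a 'thought' plus principle from an original/rewritten pair."""
    prompt = REWRITE_TO_PRINCIPLE.format(original=original, rewritten=rewritten)
    completion = llm_complete(prompt, num_candidates=1)[0]
    # The completion is expected to contain the reasoning followed by a line
    # beginning with "Principle:"; keep only the principle text.
    return completion.split("Principle:")[-1].strip()
```

The "Thought:" line is what makes the rewrite prompt chain-of-thought style: the model must articulate the difference between the two outputs before committing to a principle, which is the behavior the authors found produced better principles than earlier versions without it.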


[3] anonymized for peer review

[4] https://openai.com/api/

[5] https://developers.generativeai.google/

This paper is available on arXiv under a CC 4.0 license.

