Table of Links

Abstract and 1 Introduction
2 Survey with Industry Professionals
3 RQ1: Real-World Use Cases that Necessitate Output Constraints
4 RQ2: Benefits of Applying Constraints to LLM Outputs and 4.1 Increasing Prompt-based Development Efficiency
4.2 Integrating with Downstream Processes and Workflows
4.3 Satisfying UI and Product Requirements and 4.4 Improving User Experience, Trust, and Adoption
5 How to Articulate Output Constraints to LLMs and 5.1 The Case for GUI: A Quick, Reliable, and Flexible Way of Prototyping Constraints
5.2 The Case for NL: More Intuitive and Expressive for Complex Constraints
6 The ConstraintMaker Tool and 6.1 Iterative Design and User Feedback
7 Conclusion and References
A. The Survey Instrument

4 RQ2: BENEFITS OF APPLYING CONSTRAINTS TO LLM OUTPUTS

Beyond the aforementioned use cases, our survey respondents reported a range of benefits that the ability to constrain LLM outputs could offer. These include both developer-facing benefits, like increasing prompt-based development efficiency and streamlining integration with downstream processes and workflows, as well as user-facing benefits, like satisfying product and UI requirements and improving user experience of and trust in LLMs (Table 2). Here are the most salient responses:

4.1 Increasing Prompt-based Development Efficiency

First and foremost, being able to constrain LLM outputs can significantly increase the efficiency of prompt-based engineering and development by reducing the trial and error currently needed to manage LLM unpredictability. Developers noted that the process of “defining the [output] format” alone is “time-consuming,” often requiring extensive prompt testing to identify the most effective one (consistent with what previous research has found [30, 41]). Additionally, they often need to “request multiple responses” and keep “iterating through them until find[ing] a valid one” (a pattern sketched in the first example below). Therefore, being able to deterministically constrain the output format could not only save developers as much as “dozens of hours of work per week” spent on iterative prompt testing, but also reduce overall LLM inference costs and latency.

Another common practice that respondents reported is building complex infrastructure to post-process LLM outputs, sometimes referred to as “massaging [the output] after receiving.” For example, developers oftentimes had to “chase down ‘free radicals’ when writing error handling functions,” and felt it necessary to include “custom logic” for matching and filtering, along with “further verification” (see the second sketch below). Thus, setting constraints before LLM generation may be the key to reducing such “ad-hoc plumbing code” post-generation, simplifying “maintenance,” and enhancing the overall “developer experience.” As one respondent vividly described: “it’s a much nicer experience if it (formatting the output in bullets) ‘just works’ without having to implement additional infra...”
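To make the first of these pain points concrete, here is a minimal sketch of the “request multiple responses until one is valid” loop that respondents described. Everything in it is illustrative rather than taken from the paper: call_llm stands in for an arbitrary model API, and the expected keys title and summary are a hypothetical output contract.

```python
import json

MAX_ATTEMPTS = 5  # hypothetical retry budget


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever LLM API a team calls."""
    raise NotImplementedError


def get_valid_json(prompt: str) -> dict:
    """Re-request until the model emits parseable JSON containing the
    expected keys; every failed attempt adds latency and inference cost."""
    for _ in range(MAX_ATTEMPTS):
        raw = call_llm(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: pay for another call
        if isinstance(parsed, dict) and {"title", "summary"} <= parsed.keys():
            return parsed  # a valid response, possibly only on the Nth try
    raise ValueError(f"no valid response after {MAX_ATTEMPTS} attempts")
```

Under a deterministic output constraint that admits only schema-conforming generations, this loop collapses to a single call.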
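And here is what the “ad-hoc plumbing code” from the second pain point often looks like in practice: a sketch, with hypothetical names rather than code from the paper, of the matching, filtering, and verification layer a developer maintains because the model’s bullet formatting cannot be guaranteed.

```python
import re

# Illustrative "custom logic" for matching: accepts "- item" or "* item",
# the bullet shape the prompt asked for.
BULLET = re.compile(r"^\s*[-*]\s+(.+?)\s*$")


def postprocess(raw: str) -> list[str]:
    """Massage a response that should have been a plain bulleted list
    but may arrive wrapped in code fences or chatty preamble text."""
    seen: set[str] = set()
    items: list[str] = []
    for line in raw.splitlines():
        match = BULLET.match(line)
        if match is None:
            continue  # filtering: drop preambles, fences, stray prose
        item = match.group(1)
        if item not in seen:  # further verification: no duplicates
            seen.add(item)
            items.append(item)
    return items
```

If the decoder could instead be constrained to emit only well-formed bullets, this function, its edge cases, and its maintenance burden would all disappear.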
This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.

Authors:
(1) Michael Xieyang Liu, Google Research, Pittsburgh, PA, USA (lxieyang@google.com);
(2) Frederick Liu, Google Research, Seattle, Washington, USA (frederickliu@google.com);
(3) Alexander J. Fiannaca, Google Research, Seattle, Washington, USA (afiannaca@google.com);
(4) Terry Koo, Google, Indiana, USA (terrykoo@google.com);
(5) Lucas Dixon, Google Research, Paris, France (ldixon@google.com);
(6) Michael Terry, Google Research, Cambridge, Massachusetts, USA (michaelterry@google.com);
(7) Carrie J. Cai, Google Research, Mountain View, California, USA (cjcai@google.com).