How We Integrated Downstream Processes and Workflows

by Structuring, March 19th, 2025

Too Long; Didn't Read

Because LLMs are often used as sub-components in larger pipelines, respondents emphasized that guaranteed constraints are critical to ensuring that the output of their work is compatible with downstream processes, such as downstream modules that expect a specific format or functional code as input.

Abstract and 1 Introduction

2 Survey with Industry Professionals

3 RQ1: Real-World Use Cases that Necessitate Output Constraints

4 RQ2: Benefits of Applying Constraints to LLM Outputs and 4.1 Increasing Prompt-Based Development Efficiency

4.2 Integrating with Downstream Processes and Workflows

4.3 Satisfying UI and Product Requirements and 4.4 Improving User Experience, Trust, and Adoption

5 How to Articulate Output Constraints to LLMs and 5.1 The Case for GUI: A Quick, Reliable, and Flexible Way of Prototyping Constraints

5.2 The Case for NL: More Intuitive and Expressive for Complex Constraints

6 The ConstraintMaker Tool and 6.1 Iterative Design and User Feedback

7 Conclusion and References

A. The Survey Instrument

4.2 Integrating with Downstream Processes and Workflows

Because LLMs are often used as sub-components in larger pipelines, respondents emphasized that guaranteed constraints are critical to ensuring that LLM output is compatible with downstream processes, such as modules that expect a specific format or functional code as input. Specifically for code generation, they highlighted the necessity of constraining the output to ensure “executable” code that adheres only to “methods specified in the context” and avoids errors such as hallucinating “unsupported operators” or “SQL ... in a different dialect.” Note that while the “function calling” features in the latest LLMs [8, 26] can “select” functions to call from a predefined list, users still have to implement these functions correctly themselves.
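
As a rough illustration (not taken from the paper), pipelines built this way typically place a validation gate between the LLM and the next module; the field names and the PostgreSQL-dialect requirement below are hypothetical stand-ins for such a downstream contract.

```python
import json

# Hypothetical downstream contract: the next pipeline stage expects a JSON
# object containing a SQL "query" string written in one specific dialect.
REQUIRED_DIALECT = "postgresql"

def validate_llm_output(raw_output: str) -> dict:
    """Reject LLM output that the downstream module could not consume."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as err:
        raise ValueError(f"output is not valid JSON: {err}") from err

    if not isinstance(data.get("query"), str):
        raise ValueError("missing or non-string 'query' field")
    if data.get("dialect") != REQUIRED_DIALECT:
        raise ValueError(f"expected {REQUIRED_DIALECT}, got {data.get('dialect')!r}")
    return data
```

Without guaranteed constraints, a gate like this can only catch violations after the fact (and trigger a retry); constrained decoding would rule them out before the output ever reaches the rest of the pipeline.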


Many studies indicate that LLMs are highly effective at creating synthetic datasets for AI training [9, 15, 38], and our survey respondents postulated that being able to impose constraints on LLMs could improve such datasets’ quality and integrity. For instance, one respondent wished that model-generated movie data would “not say a movie’s name when it describes its plot,” as they intended to use this data to train a “predictive model of the movie itself.” Any breach of such constraints could render the data “unusable.”
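
A minimal sketch of that scenario, using invented field names and toy records: a post-hoc filter drops synthetic examples whose generated plot leaks the movie's title, the kind of violation that would otherwise make the dataset “unusable.”

```python
def plot_leaks_title(example: dict) -> bool:
    """True if the generated plot summary mentions the movie's own title."""
    return example["title"].lower() in example["plot"].lower()

# Toy stand-ins for model-generated records; a real pipeline would stream
# thousands of these from the LLM.
synthetic_examples = [
    {"title": "Inception", "plot": "A thief steals corporate secrets through shared dreams."},
    {"title": "Arrival", "plot": "In Arrival, a linguist races to decode an alien language."},
]

# Keep only the examples that respect the constraint.
clean_examples = [ex for ex in synthetic_examples if not plot_leaks_title(ex)]
```

Filtering after generation wastes model calls and shrinks the dataset; a constraint enforced at generation time would keep every example valid by construction.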


Furthermore, given the industry trend of continually migrating to newer, more cost-effective models, respondents highlighted the importance of “canonizing” constraints across models to avoid extra prompt engineering after each migration (e.g., “if I switch model, I get the formatting immediately”). This suggests that it could be more advantageous for models to accept output constraints independently of the prompt, which would then contain only the task instructions.
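
One way to read this suggestion, sketched below with a hypothetical generate() wrapper rather than any real provider API: the output schema travels alongside, not inside, the prompt, so migrating to a new model changes only the model identifier.

```python
TASK_PROMPT = "Summarize the following support ticket."  # task instructions only

OUTPUT_SCHEMA = {  # the constraint, kept out of the prompt text
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["summary", "priority"],
}

def generate(model_name: str, prompt: str, schema: dict) -> str:
    """Illustrative stand-in for a model call that accepts an output schema as a
    first-class argument; actual parameter names vary by provider."""
    raise NotImplementedError("wire up your provider's constrained-decoding API here")

# Migration then touches only the model identifier:
#   generate("model-v1", TASK_PROMPT, OUTPUT_SCHEMA)
#   generate("model-v2", TASK_PROMPT, OUTPUT_SCHEMA)
```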


This paper is available on arXiv under the CC BY-NC-SA 4.0 DEED license.

Authors:

(1) Michael Xieyang Liu, Google Research, Pittsburgh, PA, USA (lxieyang@google.com);

(2) Frederick Liu, Google Research, Seattle, Washington, USA (frederickliu@google.com);

(3) Alexander J. Fiannaca, Google Research, Seattle, Washington, USA (afiannaca@google.com);

(4) Terry Koo, Google, Indiana, USA (terrykoo@google.com);

(5) Lucas Dixon, Google Research, Paris, France (ldixon@google.com);

(6) Michael Terry, Google Research, Cambridge, Massachusetts, USA (michaelterry@google.com);

(7) Carrie J. Cai, Google Research, Mountain View, California, USA (cjcai@google.com).