Authors:
(1) Kexun Zhang, UC Santa Barbara and Equal contribution;
(2) Hongqiao Chen, Northwood High School and Equal contribution;
(3) Lei Li, Carnegie Mellon University;
(4) William Yang Wang,UC Santa Barbara.
A syntactically correct tool call needs to refer to an existent tool name and pass type-conforming arguments. Motivated by the fact that it is straightforward to verify the syntax of a tool call using a finite-state machine (FSM), we propose TOOLDEC, a constrained decoding algorithm guided by an FSM. During each decoding step, the model samples from a subset of the vocabulary that only contains syntactically correct tokens. The FSM that specifies the token subsets can be constructed from the tool documentation. For example, in Figure 2, an FSM is constructed for functions add, exp, square and sqrt. Table 2 shows how TOOLDEC answers the question “the side of a square is 5, what’s its area?” using the FSM. With the guidance from the FSM, TOOLDEC achieves the following goals:
• Switching Modes. Switch between “text mode” when the model is free to generate any text and “tool mode” when the model can only generate valid tool calls.
• Generating Tool Names. At the beginning of a tool call, only generate correct existent tool names from a pre-defined list of tools.
• Passing Arguments. Only pass type-conforming arguments to the tool.
TOOLDEC is guided by a finite-state machine (FSM). An FSM is a 5-tuple (S, V, g, s0, R), consisting of a finite state set S, an alphabet V , a transition function g : S ×V → S, an initial state s0 and a set of accepting states R. In our case, S and g are constructed from the tool signature. V is the token vocabulary of the language model. R corresponds to pre-defined tokens that can determine the LM has completed the task, like ‘’.
At each decoding step t, TOOLDEC maintains a current state s. It can only generate the tokens permitted by the FSM, i.e. the tokens for which g(s, ·) is defined. These
To construct a trie, we insert all the strings into it one by one. Inserting a string into a trie means going from the root down the path made by the string and creating new nodes when the next step in the path does not exist. For example, we show how two more tools names, exp10 and expand can be added to the trie in Figure 3.
Note that the construction of trie depends on one assumption: no two tools have the same name. While this is a reasonable assumption to make, there could be exceptions in real applications. In that case, we could rewrite the tool names to include more details to disambiguate them. Rewriting abstract and hard-to-understand tool names can also make it easier for the language model to select them by name.
Generating Syntactically Valid Tool Arguments. Tool arguments have specified types. Like arguments in a program, they need to follow certain grammar rules. These rules can be specified by finite-state machines. For example, the “IntFSM” in Figure 2 depicts a finite-state machine that only accepts integer literals. For all arguments of a tool, we chain their corresponding FSMs together and use the last state corresponding to the tool name as the initial state of this FSM chain. Note that in practice, it’s not necessary to explicitly construct this FSM. Any grammar checker that tells the set of valid next tokens suffice.
This paper is available on arxiv under CC 4.0 DEED license.