Agents are classes that leverage language models to perform tasks, answer questions, or solve problems using various tools. They are often used to interact with data sources as part of that problem-solving.
An executor, on the other hand, is a running instance of an agent and is used to execute tasks based on the agent’s decisions and configured tools.
Tools are instances of classes that perform specific tasks or provide specific utilities. These classes are derived from a base class called Tool.
Examples of tools include:
- SerpAPI: Uses the SerpAPI service for search engine result data.
- ReadFileTool: Provides file reading capabilities.
- WriteFileTool: Provides file writing capabilities.
The agents decide which tools to use, and the executors execute them.
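To make the split between tools and executors concrete, here is a minimal sketch in plain TypeScript. It does not use LangChain itself: SimpleTool, EchoTool, WordCountTool, and runTool are hypothetical names invented for illustration, and the tools are synchronous for brevity, whereas real tools are typically asynchronous.

```typescript
// Hypothetical, simplified stand-in for a tool base class (not LangChain's actual Tool).
abstract class SimpleTool {
  abstract name: string;
  abstract description: string;
  // Kept synchronous for brevity; this is a simplification for the sketch.
  abstract call(input: string): string;
}

class EchoTool extends SimpleTool {
  name = "echo";
  description = "Returns the input unchanged.";
  call(input: string): string {
    return input;
  }
}

class WordCountTool extends SimpleTool {
  name = "word-count";
  description = "Counts the whitespace-separated words in the input.";
  call(input: string): string {
    const words = input.trim().split(/\s+/).filter((w) => w.length > 0);
    return String(words.length);
  }
}

// The agent decides on a tool by name; the executor looks it up and runs it.
function runTool(tools: SimpleTool[], toolName: string, input: string): string {
  for (const tool of tools) {
    if (tool.name === toolName) {
      return tool.call(input);
    }
  }
  return `${toolName} is not a valid tool`;
}

const demoTools: SimpleTool[] = [new EchoTool(), new WordCountTool()];
console.log(runTool(demoTools, "word-count", "agents decide, executors execute"));
```

Each tool carries a name and a description so that a language model can be told what is available, while the lookup-and-run step stays on the executor side.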
We will run through a quick workflow of how we actually use agents and executors in practice.
When initialised, an AgentExecutor accepts an AgentExecutorInput, which contains the agent, tools, maximum iterations, and optional early stopping method. The AgentExecutor constructor sets the following properties:
- agent
- tools
- returnIntermediateSteps
- maxIterations (default is 15)
- earlyStoppingMethod (default is force)
You can create an AgentExecutor using AgentExecutor.fromAgentAndTools and providing the required input fields. Here’s a working example from the LangChain repository:
import { AgentExecutor, ZeroShotAgent } from "langchain/agents";
import { OpenAI } from "langchain/llms/openai";
import { SerpAPI } from "langchain/tools";
import { Calculator } from "langchain/tools/calculator";
export const run = async () => {
  // Language model that drives the agent's decisions
  const model = new OpenAI({ temperature: 0 });
  // Tools the agent is allowed to choose from
  const tools = [
    new SerpAPI(process.env.SERPAPI_API_KEY, {
      location: "Austin,Texas,United States",
      hl: "en",
      gl: "us",
    }),
    new Calculator(),
  ];
  // Note: this excerpt never passes `model` to the agent; the full example likely wires it in (e.g. via an llmChain).
  const agent = new ZeroShotAgent({ allowedTools: ["search", "calculator"] });
  const agentExecutor = AgentExecutor.fromAgentAndTools({ agent, tools });
  console.log("Loaded agent.");
  const input = `Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?`;
  console.log(`Executing with input "${input}"...`);
  const result = await agentExecutor.call({ input });
  console.log(`Got output ${result.output}`);
};
In the code above, we are:
- Using OpenAI as the language model
- Setting up the tools: SerpAPI and Calculator
- Creating a ZeroShotAgent (there are many types of agents with different use cases, or you can create your own!)
- Creating an AgentExecutor with the agent and tools
- Calling the call method on the agent executor to receive the computed output from the agent
When you call the .call() method of the AgentExecutor, it triggers the execution process. This is actually just a loop that performs the following actions:
1. The agent plans the next step, and the language model's response is parsed into either an AgentFinish or an AgentAction. For instance, if the parsed text contains the prefix declared in the variable FINAL_ANSWER_ACTION (by default it’s Final Answer:), the agent returns an instance of AgentFinish. This enables the AgentExecutor to interpret the decision made by the language model and carry out actual executions of tools.
2. If the result is an AgentFinish object, the execution loop is terminated, and the output is returned using the getOutput function, which computes the final output based on the agent's finish step, intermediate steps, and additional data from the agent.
3. If the result is an AgentAction object, the executor processes the actions returned by the agent plan, calling the corresponding tool for each action and generating observations.
4. The loop repeats until the shouldContinue function determines that it should stop.
5. The final output is returned through the getOutput function.
This looping process of planning, parsing, and executing tools enables agents to leverage the decision-making power of language models to build autonomous entities that can carry out more complicated tasks.
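The loop described above can be sketched in a few dozen lines of plain TypeScript. This is a simplified illustration under stated assumptions, not LangChain's implementation: parseModelText, runExecutor, scriptedPlan, and the "Action:" / "Action Input:" text format are hypothetical, the language model is replaced by a scripted function, and tools are synchronous. Only the names AgentAction, AgentFinish, FINAL_ANSWER_ACTION, maxIterations, returnIntermediateSteps, and the default of 15 come from the article; the message returned when the iteration budget runs out is an assumed stand-in for the force early stopping method.

```typescript
// Hypothetical, simplified shapes; the real classes carry more fields.
type AgentAction = { kind: "action"; tool: string; toolInput: string };
type AgentFinish = { kind: "finish"; output: string };
type Step = { action: AgentAction; observation: string };
type ToolFn = (input: string) => string;

const FINAL_ANSWER_ACTION = "Final Answer:";

// Parse raw model text into a finish or an action (text format assumed for this sketch).
function parseModelText(text: string): AgentAction | AgentFinish {
  const idx = text.indexOf(FINAL_ANSWER_ACTION);
  if (idx !== -1) {
    return { kind: "finish", output: text.slice(idx + FINAL_ANSWER_ACTION.length).trim() };
  }
  const match = /Action:\s*(.*)\nAction Input:\s*(.*)/.exec(text);
  if (!match) {
    throw new Error(`Could not parse model output: ${text}`);
  }
  return { kind: "action", tool: (match[1] ?? "").trim(), toolInput: (match[2] ?? "").trim() };
}

function runExecutor(
  plan: (steps: Step[], input: string) => string, // stands in for the language model call
  tools: Record<string, ToolFn>,
  input: string,
  options: { maxIterations?: number; returnIntermediateSteps?: boolean } = {}
): { output: string; intermediateSteps?: Step[] } {
  const maxIterations = options.maxIterations ?? 15; // default stated in the article
  const steps: Step[] = [];
  const getOutput = (output: string) =>
    options.returnIntermediateSteps ? { output, intermediateSteps: steps } : { output };

  // Plays the role of shouldContinue: loop while under the iteration budget.
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    const decision = parseModelText(plan(steps, input));
    if (decision.kind === "finish") {
      return getOutput(decision.output); // AgentFinish ends the loop
    }
    // AgentAction: run the chosen tool and record the observation.
    const tool: ToolFn | undefined = tools[decision.tool];
    const observation = tool ? tool(decision.toolInput) : `${decision.tool} is not a valid tool`;
    steps.push({ action: decision, observation });
  }
  // Assumed "force" behaviour: stop with a fixed message once the budget is spent.
  return getOutput("Agent stopped due to max iterations.");
}

// Scripted stand-in for a model: request the adder once, then finish with its result.
function scriptedPlan(steps: Step[]): string {
  const last = steps[steps.length - 1];
  return last === undefined
    ? "Action: adder\nAction Input: 2 3"
    : `Final Answer: ${last.observation}`;
}

const sketchTools: Record<string, ToolFn> = {
  adder: (text) => String(text.split(/\s+/).map(Number).reduce((a, b) => a + b, 0)),
};

const sketchResult = runExecutor(scriptedPlan, sketchTools, "What is 2 + 3?", {
  returnIntermediateSteps: true,
});
console.log(sketchResult.output, sketchResult.intermediateSteps);
```

The scripted planner makes the control flow easy to follow: the first pass yields an action, the tool's observation is stored as an intermediate step, and the second pass yields a finish that ends the loop.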
Interestingly, this article was ~80% generated using a tool I built called Genie (with LangChain). The goal for Genie is to help developers understand complex code implementations in minutes instead of hours. Do give it a go if you’re interested! www.birdlabs.ai
Here are some of the questions I asked in order to understand the mechanics behind autonomous agents.
1. Getting a brief overview of agents and executors
2. Realising I need more information on what tools are
3. Diving deeper into the implementation of an agent executor
4. Realising I need to know how the Agent interprets the responses from the language model
I’ll be dropping more tutorials that use Genie! Follow my blog and let me know what you want to learn about next :)
LinkedIn: https://www.linkedin.com/in/dion-neo-470a161a6/
Email: [email protected]
Twitter: @neowenshun