This is the second part of a multi-part series on building Agents with OpenAI's Assistant API using the Python SDK.

Recap of Part 1

In Part 1 of this series, we created a simple conversational Agent using the OpenAI Assistant API.

Let's recall our definition of an Agent. An agent is a piece of software leveraging an LLM (Large Language Model) aiming to mimic human behavior. That means it can not only converse and understand language, but it can also perform actions that have an impact on the real world. These actions are typically called tools.

I use the terms tools and functions interchangeably when it comes to functions that the agent is able to call. OpenAI also uses the term action interchangeably with tools to refer to the same concept. It seems the jury is still out on which term will become the standard.

In Part 2 of the series, let's build our first tool! We will be building on top of the simple conversational Agent we built in Part 1. We will purposefully call our implementation an Agent and refer to the OpenAI SDK implementation as an Assistant to easily distinguish between the two.

Prerequisites

To follow along with this tutorial, you will need the following:

- The Part 1 code, which you have read or at least copied as your starting point
- Python 3 installed on your machine
- An OpenAI API key
- Basic knowledge of Python programming

Dependencies

We have one new dependency, docstring_parser, a library used for parsing docstrings in Python code. This is needed for our Agent to dynamically interpret and manage the functions it can call. More on this later.

python3 -m venv venv
source venv/bin/activate
pip install docstring_parser

The goal

In Part 1, we created an Agent that represents a famous hobbit who spends too much time thinking about breakfast 🍳 The goal of this tutorial is to add two abilities, or tools, to our Agent:

- Eat a second breakfast if he has only had one, or else eat lunch.
- Tell us the current date.

NOTE: For those of you who don't know, hobbits eat two breakfasts. Also, some humans, like myself...

It's a ridiculous example, but it will clearly illustrate the data flow you can set up for more complex use cases that interact with real-world data.

Setting up the mock database

Let's do a small amount of prep work before we dive in. We will create a simple mock database represented by a new file, db.py, containing a variable, breakfast_count. In this case, the variable will keep track of how many breakfasts our hobbit has eaten.

breakfast_count = 1

While we are here, we will update the get_breakfast_count_from_db method on our Agent class to get data from our mock db.

import db

class Agent:
    # ... (rest of code)
    def get_breakfast_count_from_db(self):
        return db.breakfast_count

😴... That's out of the way. Let's move on to the fun part.

Our first tool

A tool is quite simply a function that the Agent is aware it needs to call in a specific scenario. Therefore, let's first define a simple function that handles our breakfast logic.

IMPORTANT: I want to draw your attention to the Google Style docstring in the tool functions we are about to define. It's crucial to get the documentation syntax right, as this is what will be used to extract the correct OpenAI function format in JSON. More on this soon.

Create a new file, tools.py, with this content:

# datetime is used by the tell_the_date tool defined further down in this file
from datetime import datetime

import db


def eat_next_meal(breakfast_count: int):
    """
    Call this tool when user wants you to eat another meal.

    Args:
        breakfast_count (int): Value with same name from metadata.

    Returns:
        str: The meal you should eat next.
    """
    print("== eat_next_meal ==> tool called")
    
    if breakfast_count == 2:
        return "You have already eaten breakfast twice today. You eat lunch now."
    if breakfast_count == 1:
        db.breakfast_count += 1
        return "You have only eaten one breakfast today. You eat second breakfast now."

Pretty simple, right? If breakfast_count is 1, eat second breakfast and update the database to reflect that 2 breakfasts have been eaten. If breakfast_count is 2, eat lunch.

You could return any string you want from the tool. This return value is fed right back into the Run to inform the Assistant of the outcome of the action taken. If your agent only cares that the tool was called and completed successfully, you could just return a "success" or "failed" message.

Let's quickly create the second tool for our Agent, also in tools.py, so that we can illustrate how multiple tools can be used. This tool allows our Agent to tell us what the current date is.

def tell_the_date():
    """
    Call this tool when the user wants to know the date.

    Returns:
        str: The current date
    """
    print("== tell_the_date ==> tool called")
    current_date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"The date is {current_date}"

Providing the tools to our Agent

Now, let's provide these new tools to our Agent in main.py. We create a dictionary that maps function names to their corresponding callable objects and pass this as the tools argument to our class instantiation. We could hardcode the dictionary keys as strings, but we all know how easy typos are, so I chose to leverage Python's built-in __name__ property on functions to make this less error-prone:

from tools import eat_next_meal, tell_the_date

agent = Agent(name="Bilbo Baggins",
              personality="You are the accomplished and renowned adventurer from The Hobbit. You act like you are a bit of a homebody, but you are always up for an adventure. You worry a bit too much about breakfast.",
              tools={
                  eat_next_meal.__name__: eat_next_meal,
                  tell_the_date.__name__: tell_the_date
              })

Updating our Agent to accept tools

Let's put these new tools into our Agent's tool belt by updating the Agent class constructor to accept a dictionary of tools. You could bake this right into the class using class methods, but since it's common for different Agents to have different functionalities, we will pass tools as a constructor parameter and store them on the instance as tool_belt for later use.

class Agent:
    # ... (rest of code)
    def __init__(self, name: str, personality: str, tools: dict[str, callable]):
        self.name = name
        self.personality = personality
        self.client = openai.OpenAI(api_key="sk-*****")

        self.tool_belt = tools
        self.assistant = self.client.beta.assistants.create(
            name=self.name,
            model="gpt-4-turbo-preview",
        )

Our Agent now has a tool belt it can access, but the OpenAI Assistant is unaware of the tools in it. We could provide the tools when creating the Assistant (assistants.create) by passing a tools parameter, but tools can change often in development, and it's clunky to have to call assistants.update every time you modify a tool.

Therefore, a better approach is to provide the needed tools to every Run so that they are set dynamically each time. Let's do exactly that by adding the tools parameter to our run creation:

class Agent:
    # ... (rest of code)
    def _create_run(self):
        count = self.get_breakfast_count_from_db()
        return self.client.beta.threads.runs.create(
            thread_id=self.thread.id,
            assistant_id=self.assistant.id,
            tools=self._get_tools_in_open_ai_format(), # add this line
            # ... (rest of code)
        )

Woah!? Wait a minute... ✋🏻 Where did _get_tools_in_open_ai_format come from? Why aren't we just passing tool_belt directly?

Well, let's create that method, and I'll explain right after:

import docstring_parser

class Agent:
    # ... (rest of code)
    def _get_tools_in_open_ai_format(self):
        python_type_to_json_type = {
            "str": "string",
            "int": "integer",
            "float": "number",
            "bool": "boolean",
            "list": "array",
            "dict": "object"
        }

        return [
            {
                "type": "function",
                "function": {
                    "name": tool.__name__,
                    "description": docstring_parser.parse(tool.__doc__).short_description,
                    "parameters": {
                        "type": "object",
                        "properties": {
                            p.arg_name: {
                                "type": python_type_to_json_type.get(p.type_name, "string"),
                                "description": p.description
                            }
                            for p in docstring_parser.parse(tool.__doc__).params

                        },
                        "required": [
                            p.arg_name
                            for p in docstring_parser.parse(tool.__doc__).params
                            if not p.is_optional
                        ]
                    }
                }
            }
            for tool in self.tool_belt.values()
        ]

😵... Ok, you really don't need to try to understand what each line is doing here since I took the time to work this all out for you. All you need to know is that this method takes the tool_belt attribute and uses the docstring_parser library we installed at the beginning of this tutorial to parse the docstring of each function and extract the correct OpenAI JSON format. That is why correctly formatting the docstring is so important.

Otherwise, we would have to define our functions as Python code and define them again manually in the OpenAI JSON format. That's too much room for human error in my book.

Polling for tool calls

As we saw in Part 1, we have a polling mechanism to determine the current status of a Run. One of these statuses is requires_action. This means the Assistant has determined that one or several tools need to be called based on the instructions and tool definitions provided. The Run will not continue until all the tools have been called and the results of those calls have been submitted to the Run.

Let's, therefore, update our _poll_run method to respond appropriately to the requires_action status:

class Agent:
    # ... (rest of code)
    def _poll_run(self, run: Run):
        status = run.status
        start_time = time.time()
        while status != "completed":
            if status == 'failed':
                raise Exception(f"Run failed with error: {run.last_error}")
            if status == 'expired':
                raise Exception("Run expired.")
            # add the below code block
            if status == 'requires_action':
                self._call_tools(
                    run.id, run.required_action.submit_tool_outputs.tool_calls)

            # ... (rest of method)

As you might have guessed, _call_tools will contain all the logic to dynamically call the available tools.

Remember, the Assistant only knows what tools need to be called but doesn't actually call them. The Run returns a list of tool call objects, run.required_action.submit_tool_outputs.tool_calls, that contain the names of the functions that need to be called along with the parameters to pass to those function calls. It's up to our custom Agent implementation to obey the Assistant's commands and call the actual function implementations.

Let's implement _call_tools. We'll break this down as code comments so it's easier to understand what is happening:

import json

class Agent:
    # ... (rest of code)
    def _call_tools(self, run_id: str, tool_calls: list[dict]):
        # We create a tool_outputs list to collect the results of function calls.
        tool_outputs = []

        # We iterate over all the tool_calls to deal with them individually
        for tool_call in tool_calls:
            # We get the `function` object from the tool_call
            function = tool_call.function
            # We extract the arguments from the function object.
            # They are in JSON so we need to load them with the json module.
            function_args = json.loads(function.arguments)
            # We map the function name to our callable function in our Agent's tool belt.
            function_to_call = self.tool_belt[function.name]
            # We can now call the function with the provided arguments.
            function_response = function_to_call(**function_args)
            # We append the response to the tool_outputs list
            tool_outputs.append(
                {"tool_call_id": tool_call.id, "output": function_response})

        # Finally, we submit the tool outputs to OpenAI
        self.client.beta.threads.runs.submit_tool_outputs(
            thread_id=self.thread.id,
            run_id=run_id,
            tool_outputs=tool_outputs
        )

Once the OpenAI submit_tool_outputs method has been called with all the tool call outputs, we have completed the requirements for the Run to move past the requires_action status. The status will then switch to in_progress before eventually reaching the completed status and returning a response to the user.

Running it

Let's give it a run:

python3 main.py

User: Good morning

Assistant: Good morning! What an excellent day it seems for an adventure... or perhaps for a hearty breakfast first. How can I assist you on this fine day?

User: Should you eat breakfast?

== eat_next_meal ==> tool called

Assistant: Ah, only one breakfast so far? Well then, it's clear what must be done. It's time for a second breakfast!

User: You should eat another meal.

== eat_next_meal ==> tool called

Assistant: Ah, having already enjoyed both breakfast and second breakfast, it seems it's time to move on to lunch! This is turning into quite the day of culinary adventures.

User: What day is it?

== tell_the_date ==> tool called

Assistant: It's the 19th of February, 2024. Seems like a perfect day to embark on a quest or to delve into the mysteries of Middle-earth. What plans do we have for today?

Voilà! Your hobbit can tell the date and eat the correct meal now.

Even though this example is quite silly, my intention is to provide you with a functioning code sample to build your own powerful agents.

In Part 3, we will implement RAG (Retrieval Augmented Generation) using PostgreSQL. Wait, what? OpenAI Assistants have that built-in; why not use their implementation? I'll explain my reasoning in Part 3.

Thank you for reading. Happy to hear any thoughts and feedback in the comments. Follow me on LinkedIn for more content like this.
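
As a recap of the data flow built in this part, here is a minimal offline sketch of the dispatch step that runs without an API key. It assumes tool calls expose id, function.name, and function.arguments, as they are accessed in _call_tools. The fake_tool_call helper and the bake_cake tool name are hypothetical, eat_next_meal is a simplified stand-in that skips the mock database update, and returning an error string for unknown tools or bad arguments is one possible design choice rather than something the Agent in this tutorial does.

```python
import json
from types import SimpleNamespace


def eat_next_meal(breakfast_count: int) -> str:
    """Simplified stand-in for the tutorial's tool (no mock database update)."""
    if breakfast_count >= 2:
        return "You eat lunch now."
    return "You eat second breakfast now."


def dispatch_tool_calls(tool_belt, tool_calls):
    """Call each requested tool and collect one output entry per tool call."""
    tool_outputs = []
    for tool_call in tool_calls:
        name = tool_call.function.name
        tool = tool_belt.get(name)
        if tool is None:
            # Assumption: tell the model about the problem instead of raising.
            output = f"Error: unknown tool '{name}'"
        else:
            try:
                # Arguments arrive as a JSON string; an empty string means no arguments.
                args = json.loads(tool_call.function.arguments or "{}")
                output = str(tool(**args))
            except (json.JSONDecodeError, TypeError) as exc:
                output = f"Error: could not call '{name}': {exc}"
        tool_outputs.append({"tool_call_id": tool_call.id, "output": output})
    return tool_outputs


def fake_tool_call(call_id, name, arguments):
    """Hypothetical helper mimicking the attribute layout of a tool call object."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=arguments),
    )


calls = [
    fake_tool_call("call_1", "eat_next_meal", '{"breakfast_count": 1}'),
    fake_tool_call("call_2", "bake_cake", "{}"),
]
outputs = dispatch_tool_calls({eat_next_meal.__name__: eat_next_meal}, calls)
for item in outputs:
    print(item)
```

Every tool call gets an entry in the result, even when the call fails, which matters because the Run stays in requires_action until an output has been submitted for each tool call.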
