Cómo crear flujos de trabajo de IA en el mundo real con AutoGen: guía paso a paso

Entonces, ¿has estado experimentando con modelos de lenguaje grandes y estás comenzando a integrar IA generativa en tus aplicaciones? ¡Eso es genial! Pero seamos realistas. Los modelos de lenguaje grandes no siempre se comportan como queremos. ¡Son como niños malvados con mente propia!

Pronto te darás cuenta de que las cadenas de indicaciones simples no son suficientes. A veces, necesitamos algo más. A veces, necesitamos flujos de trabajo con múltiples agentes. Ahí es donde entra en juego AutoGen .

Tomemos un ejemplo. Imaginemos que estamos creando una aplicación para tomar notas (claramente, el mundo no tiene suficientes). Pero queremos hacer algo especial. Queremos tomar la nota simple y sin formato que nos da un usuario y convertirla en un documento completamente reestructurado, con un resumen, un título opcional y una lista automatizada de tareas pendientes. Y queremos hacer todo esto sin sudar la gota gorda, al menos para nuestros agentes de IA.

Vale, ya sé lo que estás pensando: "¿No son estos programas para principiantes?". A eso te respondo que tienes razón. ¡Es cruel... pero tiene razón! Pero no te dejes engañar por la simplicidad del flujo de trabajo. Las habilidades que aprenderás aquí (como manejar agentes de IA, implementar el control del flujo de trabajo y administrar el historial de conversaciones) te ayudarán a llevar tu juego de IA al siguiente nivel.

¡Así que abróchese el cinturón, porque vamos a aprender cómo crear flujos de trabajo de IA usando AutoGen!

Antes de comenzar, tenga en cuenta que puede encontrar un enlace a todo el código fuente en GitHub .

Creación de flujos de trabajo personalizados

Comencemos con el primer caso de uso: “Generar un resumen de la nota seguido de un título condicional”. Para ser justos, en realidad no necesitamos usar agentes aquí. Pero bueno, tenemos que empezar por algún lado, ¿no?

Paso 1: Creación de nuestra configuración base LLM

Los marcos de trabajo de Agentic como AutoGen siempre requieren que configuremos los parámetros del modelo. Nos referimos al modelo y al modelo de respaldo que se utilizará, la temperatura e incluso configuraciones como el tiempo de espera y el almacenamiento en caché. En el caso de AutoGen, esa configuración se parece a esto:

 # build the gpt_configuration object base_llm_config = { "config_list": [ { "model": "Llama-3-8B-Instruct", "api_key": os.getenv("OPENAI_API_KEY"), "base_url": os.getenv("OPENAI_API_URL"), } ], "temperature": 0.0, "cache_seed": None, "timeout": 600, }

Como puedes ver, soy un gran fanático de la IA de código abierto y apuesto por Llama 3. Puedes hacer que AutoGen apunte a cualquier servidor de inferencia compatible con OpenAI simplemente configurando los valores api_key y base_url . Por lo tanto, siéntete libre de usar Groq, Together.ai o incluso vLLM para alojar tu modelo localmente. Estoy usando Inferix .

¡Es realmente así de fácil!

¡Tengo curiosidad! ¿Te interesaría una guía similar para el alojamiento de IA de código abierto? Cuéntamelo en los comentarios.

Paso 2: Creando nuestros agentes

Inicializar agentes conversacionales en AutoGen es bastante sencillo: simplemente proporcione la configuración base de LLM junto con un mensaje del sistema y listo.

 import autogen def get_note_summarizer(base_llm_config: dict): # A system message to define the role and job of our agent system_message = """You are a helpful AI assistant. The user will provide you a note. Generate a summary describing what the note is about. The summary must follow the provided "RULES". "RULES": - The summary should be not more than 3 short sentences. - Don't use bullet points. - The summary should be short and concise. - Identify and retain any "catchy" or memorable phrases from the original text - Identify and correct all grammatical errors. - Output the summary and nothing else.""" # Create and return our assistant agent return autogen.AssistantAgent( name="Note_Summarizer", # Lets give our agent a nice name llm_config=base_llm_config, # This is where we pass the llm configuration system_message=system_message, ) def get_title_generator(base_llm_config: dict): # A system message to define the role and job of our agent system_message = """You are a helpful AI assistant. The user will provide you a note along with a summary. Generate a title based on the user's input. The title must be witty and easy to read. The title should accurate present what the note is about. The title must strictly be less than 10 words. Make sure you keep the title short. Make sure you print the title and nothing else. """ # Create and return our assistant agent return autogen.AssistantAgent( name="Title_Generator", llm_config=base_llm_config, system_message=system_message, )

La parte más importante de la creación de agentes es el system_message . Tómese un momento para observar el system_message que he utilizado para configurar mis agentes.

Es importante recordar que la forma en que funcionan los agentes de IA en AutoGen es participando en una conversación . La forma en que interpretan y llevan adelante la conversación depende completamente del system_message con el que están configurados. Este es uno de los lugares en los que dedicará algo de tiempo para hacer las cosas bien.

Necesitamos un agente más. Un agente que actúe como representante de nosotros los humanos. Un agente que pueda iniciar la conversación con la “nota” como mensaje inicial.

 def get_user(): # A system message to define the role and job of our agent system_message = "A human admin. Supplies the initial prompt and nothing else." # Create and return our user agent return autogen.UserProxyAgent( name="Admin", system_message=system_message, human_input_mode="NEVER", # We don't want interrupts for human-in-loop scenarios code_execution_config=False, # We definitely don't want AI executing code. default_auto_reply=None, )

No ocurre nada extraño aquí. Solo tenga en cuenta que he establecido el parámetro default_auto_reply en None . Eso es importante. Establecerlo en none garantiza que la conversación finalice cada vez que se envíe un mensaje al agente de usuario.

Vaya, me olvidé por completo de crear esos agentes. Hagámoslo rápidamente.

 # Create our agents user = get_user() note_summarizer = get_note_summarizer(base_llm_config) title_generator = get_title_generator(base_llm_config)

Paso 3: Configurar la coordinación de agentes mediante un `GroupChat`

La última pieza del rompecabezas es lograr que nuestros agentes se coordinen. Necesitamos determinar la secuencia de su participación y decidir qué agentes deben participar.

Vale, es más de una pieza, pero ya me entiendes. 🙈

Una posible solución sería dejar que la IA determine la secuencia en la que participan los agentes. No es una mala idea. De hecho, es mi opción preferida cuando me enfrento a problemas complejos en los que la naturaleza del flujo de trabajo es dinámica.

Sin embargo, este enfoque tiene sus inconvenientes. ¡La realidad vuelve a golpear! El agente responsable de tomar estas decisiones a menudo necesita un modelo grande, lo que genera latencias y costos más altos. Además, existe el riesgo de que tome decisiones incorrectas.

En el caso de flujos de trabajo deterministas, en los que conocemos la secuencia de pasos de antemano, me gusta tomar las riendas y dirigir el barco yo mismo. Afortunadamente, AutoGen admite este caso de uso con una función útil llamada GroupChat .

 from autogen import GroupChatManager from autogen.agentchat.groupchat import GroupChat from autogen.agentchat.agent import Agent def get_group_chat(agents, generate_title: bool = False): # Define the function which decides the agent selection order def speaker_selection_method(last_speaker: Agent, group_chat: GroupChat): # The admin will always forward the note to the summarizer if last_speaker.name == "Admin": return group_chat.agent_by_name("Note_Summarizer") # Forward the note to the title generator if the user wants a title if last_speaker.name == "Note_Summarizer" and generate_title: return group_chat.agent_by_name("Title_Generator") # Handle the default case - exit return None return GroupChat( agents=agents, messages=[], max_round=3, # There will only be 3 turns in this group chat. The group chat will exit automatically post that. speaker_selection_method=speaker_selection_method, )

Imagina un GroupChat como un grupo de WhatsApp donde todos los agentes pueden chatear y colaborar. Esta configuración permite que los agentes se basen en el trabajo de los demás. La clase GroupChat , junto con una clase complementaria llamada GroupChatManager , actúa como los administradores del grupo y realiza un seguimiento de todos los mensajes que envía cada agente para garantizar que todos estén al tanto del historial de conversaciones.

En el fragmento de código anterior, hemos creado un GroupChat con un speaker_selection_method personalizado. El speaker_selection_method nos permite especificar nuestro flujo de trabajo personalizado. Aquí se muestra una representación visual del mismo.

Dado que el speaker_selection_method es esencialmente una función de Python, podemos hacer lo que queramos con él. Esto nos ayuda a crear flujos de trabajo realmente potentes. Por ejemplo, podríamos:

Forme pares de agentes para verificar el trabajo de cada uno.
Involucre a un agente “Solucionador de problemas” si alguno de los agentes anteriores genera un error.
Active webhooks a sistemas externos para informarles sobre el progreso realizado.

¡Imagina las posibilidades! 😜

Paso 4: Preparar todo e iniciar la conversación

El último paso es crear una instancia de GroupChat , envolverla dentro de un GroupChatManager e iniciar la conversación.

 # Create our group chat groupchat = get_group_chat([user, note_summarizer, title_generator], generate_title=True) manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=base_llm_config) # Start the chat user.initiate_chat( manager, clear_history=True, message=note, )

Nota: El usuario está chateando con el GroupChatManager , no con los agentes individuales. No tiene idea de qué agentes se unirán a la conversación para brindar la respuesta final. ¿Qué astuto es esto?

El resultado será algo como esto:

 Admin (to chat_manager): Note: Convo with editor: - discuss titles and thumbnails - discuss video editing tips tracker - Zeeshan presents the tracker - what trick helps with what - he decidedls if we can experiment with something new - make sure all all videos since how config management works in k8s are backed u - make list of YouTube thumbnail templates - make list of YouTube idea generation limits -------------------------------------------------------------------------------- Next speaker: Note_Summarizer Note_Summarizer (to chat_manager): The note is about a conversation with an editor regarding video production. They discussed titles and thumbnails, as well as a video editing tips tracker presented by Zeeshan, which highlights tricks for specific tasks. Additionally, they ensured that all videos on Kubernetes configuration management are backed up and created lists of YouTube thumbnail templates and idea generation limits. -------------------------------------------------------------------------------- Next speaker: Title_Generator Title_Generator (to chat_manager): "Video Production Chat: Titles, Thumbnails, and Editing Tips" --------------------------------------------------------------------------------

Tomando el control de la conversación

A continuación, nos sumergiremos en el caso de uso final: tomar una “nota” determinada, reestructurarla para lograr mayor claridad y luego crear una lista de tareas para el usuario.

Así es como lo haremos:

Comenzaremos por identificar una lista de temas que se tratan en la nota. Esta lista es el motor de todo el proceso. Establece las secciones de la nota reformateada y determina el nivel de detalle de las tareas generadas.

Hay un pequeño problema. Al Paraphrazer y al agente Task_Creator no les importa realmente el resultado del otro. Solo les importa el resultado del agente Topic_Analyzer .

Por lo tanto, necesitamos una forma de evitar que las respuestas de estos agentes saturen el historial de conversaciones, o será un completo caos. Ya hemos tomado el control del flujo de trabajo; ¡ahora es hora de ser también el jefe del historial de conversaciones! 😎

Paso 1: Creación de los agentes

Lo primero es lo primero: debemos configurar nuestros agentes. No los voy a aburrir con los detalles, así que aquí está el código:

 def get_topic_analyzer(base_llm_config: dict): # A system message to define the role and job of our agent system_message = """You are a helpful AI assistant. The user will provide you a note. Generate a list of topics discussed in that note. The output must obey the following "RULES": "RULES": - Output should only contain the important topics from the note. - There must be atleast one topic in output. - Don't reuse the same text from user's note. - Don't have more than 10 topics in output.""" # Create and return our assistant agent return autogen.AssistantAgent( name="Topic_Analyzer", llm_config=base_llm_config, system_message=system_message, ) def get_paraphrazer(base_llm_config: dict): # A system message to define the role and job of our agent system_message = """You are a helpful AI content editor. The user will provide you a note along with a summary. Rewrite that note and make sure you cover everything in the note. Do not include the title. The output must obey the following "RULES": "RULES": - Output must be in markdown. - Make sure you use each points provided in summary as headers. - Each header must start with `##`. - Headers are not bullet points. - Each header can optionally have a list of bullet points. Don't put bullet points if the header has no content. - Strictly use "-" to start bullet points. - Optionally make an additional header named "Addional Info" to cover points not included in the summary. Use "Addional Info" header for unclassified points. - Identify and correct spelling & grammatical mistakes.""" # Create and return our assistant agent return autogen.AssistantAgent( name="Paraphrazer", llm_config=base_llm_config, system_message=system_message, ) def get_tasks_creator(base_llm_config: dict): # A system message to define the role and job of our agent system_message = """You are a helpful AI personal assistant. The user will provide you a note along with a summary. Identify each task the user has to do as next steps. Make sure to cover all the action items mentioned in the note. The output must obey the following "RULES": "RULES": - Output must be an YAML object with a field named tasks. - Make sure each task object contains fields title and description. - Extract the title based on the tasks the user has to do as next steps. - Description will be in markdown format. Feel free to include additional formatting and numbered lists. - Strictly use "-" or "dashes" to start bullet points in the description field. - Output empty tasks array if no tasks were found. - Identify and correct spelling & grammatical mistakes. - Identify and fix any errors in the YAML object. - Output should strictly be in YAML with no ``` or any additional text.""" # Create and return our assistant agent return autogen.AssistantAgent( name="Task_Creator", llm_config=base_llm_config, system_message=system_message, )

Paso 2: Crea un `GroupChat` personalizado

Lamentablemente, AutoGen no nos permite controlar el historial de conversaciones directamente. Por lo tanto, debemos continuar y ampliar la clase GroupChat con nuestra implementación personalizada.

 class CustomGroupChat(GroupChat): def __init__(self, agents): super().__init__(agents, messages=[], max_round=4) # This function get's invoked whenever we want to append a message to the conversation history. def append(self, message: Dict, speaker: Agent): # We want to skip messages from the Paraphrazer and the Task_Creator if speaker.name != "Paraphrazer" and speaker.name != "Task_Creator": super().append(message, speaker) # The `speaker_selection_method` now becomes a function we will override from the base class def select_speaker(self, last_speaker: Agent, selector: AssistantAgent): if last_speaker.name == "Admin": return self.agent_by_name("Topic_Analyzer") if last_speaker.name == "Topic_Analyzer": return self.agent_by_name("Paraphrazer") if last_speaker.name == "Paraphrazer": return self.agent_by_name("Task_Creator") # Return the user agent by default return self.agent_by_name("Admin")

Anulamos dos funciones de la clase base GroupChat :

append : Esto controla qué mensajes se agregan al historial de conversaciones.
select_speaker : esta es otra forma de especificar el speaker_selection_method .

Pero espera, al profundizar más en el código de AutoGen, me di cuenta de que GroupChatManager hace que cada agente también mantenga el historial de conversaciones. No me preguntes por qué. ¡Realmente no lo sé!

Entonces, ampliemos también el GroupChatManager para solucionar eso:

 class CustomGroupChatManager(GroupChatManager): def __init__(self, groupchat, llm_config): super().__init__(groupchat=groupchat, llm_config=llm_config) # Don't forget to register your reply functions self.register_reply(Agent, CustomGroupChatManager.run_chat, config=groupchat, reset_config=GroupChat.reset) def run_chat( self, messages: Optional[List[Dict]] = None, sender: Optional[Agent] = None, config: Optional[GroupChat] = None, ) -> Union[str, Dict, None]: """Run a group chat.""" if messages is None: messages = self._oai_messages[sender] message = messages[-1] speaker = sender groupchat = config for i in range(groupchat.max_round): # set the name to speaker's name if the role is not function if message["role"] != "function": message["name"] = speaker.name groupchat.append(message, speaker) if self._is_termination_msg(message): # The conversation is over break # We do not want each agent to maintain their own conversation history history # broadcast the message to all agents except the speaker # for agent in groupchat.agents: # if agent != speaker: # self.send(message, agent, request_reply=False, silent=True) # Pro Tip: Feel free to "send" messages to the user agent if you want to access the messages outside of autogen for agent in groupchat.agents: if agent.name == "Admin": self.send(message, agent, request_reply=False, silent=True) if i == groupchat.max_round - 1: # the last round break try: # select the next speaker speaker = groupchat.select_speaker(speaker, self) # let the speaker speak # We'll now have to pass their entire conversation of messages on generate_reply # Commented OG code: reply = speaker.generate_reply(sender=self) reply = speaker.generate_reply(sender=self, messages=groupchat.messages) except KeyboardInterrupt: # let the admin agent speak if interrupted if groupchat.admin_name in groupchat.agent_names: # admin agent is one of the participants speaker = groupchat.agent_by_name(groupchat.admin_name) # We'll now have to pass their entire conversation of messages on generate_reply # Commented OG code: reply = speaker.generate_reply(sender=self) reply = speaker.generate_reply(sender=self, messages=groupchat.messages) else: # admin agent is not found in the participants raise if reply is None: break # The speaker sends the message without requesting a reply speaker.send(reply, self, request_reply=False) message = self.last_message(speaker) return True, None

He realizado algunas modificaciones menores a la implementación original. Deberías poder seguir los comentarios para obtener más información.

Pero hay algo que realmente quiero destacar: puedes anular el método “run_chat” de GroupChatManager para incorporar tu propio motor de flujo de trabajo, como Apache Airflow o Temporal. ¡ Los profesionales de los sistemas distribuidos saben exactamente lo poderosa que es esta capacidad!

Paso 3: Preparar todo e iniciar la conversación

¡Armamos todo como en el ejemplo anterior y miramos a este bebé ronronear! 🐱

 # Create our agents user = get_user() topic_analyzer = get_topic_analyzer(base_llm_config) paraphrazer = get_paraphrazer(base_llm_config) task_creator = get_tasks_creator(base_llm_config) # Create our group chat groupchat = CustomGroupChat(agents=[user, topic_analyzer, paraphrazer, task_creator]) manager = CustomGroupChatManager(groupchat=groupchat, llm_config=base_llm_config) # Start the chat user.initiate_chat( manager, clear_history=True, message=note, ) # Lets print the count of tasks just for fun chat_messages = user.chat_messages.get(manager) if chat_messages is not None: for message in chat_messages: if message.get("name") == "Task_Creator": taskList = yaml.safe_load(message.get("content")) # type: ignore l = len(taskList.get("tasks")) print(f"Got {l} tasks from Task_Creator.")

El resultado será algo como esto:

 Admin (to chat_manager): Note: Convo with editor: - discuss titles and thumbnails - discuss video editing tips tracker - Zeeshan presents the tracker - what trick helps with what - he decidedls if we can experiment with something new - make sure all all videos since how config management works in k8s are backed u - make list of YouTube thumbnail templates - make list of YouTube idea generation limits -------------------------------------------------------------------------------- Topic_Analyzer (to chat_manager): Here is the list of topics discussed in the note: 1. Titles 2. Thumbnails 3. Video editing tips 4. Config management in Kubernetes (k8s) 5. YouTube thumbnail templates 6. YouTube idea generation limits -------------------------------------------------------------------------------- Paraphrazer (to chat_manager): Here is the rewritten note in markdown format: ## Titles - Discuss titles and thumbnails with the editor ## Video Editing Tips Tracker ### Zeeshan presents the tracker - What trick helps with what - He decides if we can experiment with something new ## Config Management in Kubernetes (k8s) - Make sure all videos since how config management works in k8s are backed up ## YouTube Thumbnail Templates - Make a list of YouTube thumbnail templates ## YouTube Idea Generation Limits - Make a list of YouTube idea generation limits ## Additional Info - Discuss video editing tips tracker with Zeeshan - Present the tracker and decide if we can experiment with something new -------------------------------------------------------------------------------- Task_Creator (to chat_manager): tasks: - title: Discuss Titles and Thumbnails description: >- - Discuss titles and thumbnails with the editor This task involves having a conversation with the editor to discuss the titles and thumbnails for the videos. - title: Discuss Video Editing Tips Tracker description: >- - Zeeshan presents the tracker - Discuss what trick helps with what - Decide if we can experiment with something new This task involves discussing the video editing tips tracker presented by Zeeshan, understanding what tricks help with what, and deciding if it's possible to experiment with something new. - title: Back up All Videos Since How Config Management Works in k8s description: >- - Make sure all videos since how config management works in k8s are backed up This task involves ensuring that all videos related to config management in Kubernetes (k8s) are backed up. - title: Create List of YouTube Thumbnail Templates description: >- - Make list of YouTube thumbnail templates This task involves creating a list of YouTube thumbnail templates. - title: Create List of YouTube Idea Generation Limits description: >- - Make list of YouTube idea generation limits This task involves creating a list of YouTube idea generation limits. -------------------------------------------------------------------------------- Got 5 tasks from Task_Creator.

Sí. ¡Bienvenidos a la era de la IA que nos da trabajo a los humanos! (¿Dónde salió todo mal? 🤷‍♂️)

Conclusión

Desarrollar aplicaciones basadas en IA generativa es difícil, pero se puede lograr con las herramientas adecuadas. En resumen:

Los agentes de IA desbloquean esta increíble capacidad de modelar problemas complejos como conversaciones.
Si se hace correctamente, puede ayudar a que nuestras aplicaciones de IA sean más deterministas y confiables .
Herramientas como AutoGen nos proporcionan un marco con abstracciones simples para construir agentes de IA.

Como próximos pasos, puedes consultar los siguientes recursos para profundizar en el mundo de los agentes de IA:

Cómo crear flujos de trabajo de IA en el mundo real con AutoGen: guía paso a paso

Demasiado Largo; Para Leer

Creación de flujos de trabajo personalizados

Paso 1: Creación de nuestra configuración base LLM

Paso 2: Creando nuestros agentes

Paso 3: Configurar la coordinación de agentes mediante un `GroupChat`

Paso 4: Preparar todo e iniciar la conversación

Tomando el control de la conversación

Paso 1: Creación de los agentes

Paso 2: Crea un `GroupChat` personalizado

Paso 3: Preparar todo e iniciar la conversación

Conclusión

About Author

ETIQUETAS

ESTE ARTÍCULO FUE PRESENTADO EN...

Trending Topics

Classic

Neon Noir

Minty

Newspaper

HN StartUps

Cómo crear flujos de trabajo de IA en el mundo real con AutoGen: guía paso a paso

Demasiado Largo; Para Leer

Creación de flujos de trabajo personalizados

Paso 1: Creación de nuestra configuración base LLM

Paso 2: Creando nuestros agentes

Paso 3: Configurar la coordinación de agentes mediante un GroupChat

Paso 4: Preparar todo e iniciar la conversación

Tomando el control de la conversación

Paso 1: Creación de los agentes

Paso 2: Crea un GroupChat personalizado

Paso 3: Preparar todo e iniciar la conversación

Conclusión

About Author

ETIQUETAS

ESTE ARTÍCULO FUE PRESENTADO EN...

HISTORIAS RELACIONADAS

Trending Topics

Classic

Neon Noir

Minty

Newspaper

HN StartUps

Paso 3: Configurar la coordinación de agentes mediante un `GroupChat`

Paso 2: Crea un `GroupChat` personalizado