This is the first article in a two-part series on building AI Agents from the ground up. In this article, we will explore the value of AI Agents, introduce popular Agentic AI platforms, and walk through a hands-on tutorial for building a simple AI Agent. The second part of the series goes deeper with another hands-on tutorial, where we'll build Agents that can automate tasks and interact with external tools and APIs.

Article 1. Getting Started with Agentic AI: Build Your First AI Agent with Phidata

Use of the term "AI Agent" has increased roughly 10x over the past year, according to Google Trends.

Image. Google Trends

Example of an AI Agent application

Before diving into how AI Agents work, let's begin with a relatable example of how they could transform everyday tasks in the near future. Imagine planning a vacation.

Today's world: Hotels, flights and rental cars are booked independently, and places to visit are planned based on weather, preferences and family composition (single, couple, with kids). It is a time-consuming and fragmented process.

Image. Today's world

Agentic AI world: Now, imagine simply giving a prompt like the following:

"I would like to book a family trip with 2 kids in the months of June/July for a weekend plus 2 days. Do not include 2nd week/3rd week of June. I would just need to carry two cabin bags, and prefer tasting the best local food. Plan for an itinerary not longer than 2-3 hours drive from the city."

Image. Agentic AI world

An AI Agent could instantly generate a few tailored travel packages, with flights, hotels, cars, food recommendations and an optimized itinerary, so you can simply pick the one that fits your needs.
Fundamentals of AI Agents

In simple terms, AI Agents are systems that perform tasks autonomously: they interpret data from their environment and make decisions based on that data to achieve a goal. Think of them as orchestrators that connect various tools and use Large Language Models (LLMs) to reason, plan and execute tasks. For an introduction to LLMs, refer to these articles (link1, link2).

Let's break down this definition using the vacation planning example above:

Perform tasks autonomously: The Agent books flight, hotel and rental car reservations through the respective vendors.
Interpreting the data: It takes into account factors like weather, traffic and local events to suggest the activities that best fit the pace of the trip.
Making decisions: When there are dozens of restaurants available, the Agent can recommend a few based on the indicated preferences and past reviews.
Achieve goals: Ultimately, it puts together a travel plan that matches the requirements: dates, duration, preferences and family needs.

Agentic AI Platforms

An Agentic AI framework is a toolkit that enables the creation of AI systems capable of reasoning, planning, and taking actions autonomously or semi-autonomously through tool use and memory. These frameworks provide the structure needed to create agents that can interact with their environment, make decisions and execute tasks.

There are several popular Agentic AI platforms, such as LangChain, CrewAI and Phidata. For this tutorial, we will use Phidata, a lightweight and developer-friendly platform. Phidata comes with built-in access to a variety of tools and LLMs, which makes it possible to build and deploy AI Agents in just a few lines of code.

Image. Popular built-in Tools and Model wrappers in Phidata. For a full list, links here: Models, Tools.

Build a YouTube Summarizer Agent

The YouTube Summarizer Agent is designed to extract key insights and main points from any YouTube video. It saves time by providing a concise summary, so you do not need to watch the entire video.
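Before wiring up a real framework, it can help to see the orchestration idea in miniature. The following is a minimal, framework-free sketch of an agent that gathers data with a tool and hands it to a model. Everything in it is hypothetical: `ToyAgent`, `fetch_captions` and `fake_llm` are stand-ins invented for illustration and are not Phidata APIs, and the tool choice is hard-coded where a real agent would let the LLM decide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def fetch_captions(url: str) -> str:
    """Hypothetical tool: a real one would download the video's captions."""
    return f"(captions for {url})"


def fake_llm(prompt: str) -> str:
    """Hypothetical model: a real LLM would reason over the prompt."""
    return f"Summary based on: {prompt}"


@dataclass
class ToyAgent:
    tools: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)

    def run(self, task: str, url: str) -> str:
        # 1. Interpret: pick a tool (hard-coded here; an LLM chooses in practice).
        tool_name = "fetch_captions"
        self.log.append(f"calling {tool_name}")
        # 2. Act: call the tool to gather data from the environment.
        observation = self.tools[tool_name](url)
        # 3. Decide and respond: pass the task plus the observation to the model.
        return fake_llm(f"{task} | {observation}")


agent = ToyAgent(tools={"fetch_captions": fetch_captions})
result = agent.run("Summarize this video", "https://example.com/video")
print(result)
```

A framework such as Phidata replaces the hard-coded pieces: the model selects which tool to call, and the framework handles the call and feeds the result back to the model.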
For this tutorial, we will write and run the code in a Google Colab notebook and use the Phidata Agentic AI platform to power the Agent.

Model: Within Phidata, we will use the Groq model hosting platform, an inference service that runs LLMs on its own dedicated hardware (Groq's custom LPU chips). Note that Groq is different from Grok, which is an LLM from xAI. Since LLMs are resource intensive, using Groq offloads computation from local or Colab-provided hardware, which makes execution faster and more efficient. Groq provides access to multiple models from different LLM providers (see full list here).

Tools: To retrieve YouTube video data, we will use the built-in tool from the Phidata framework, called YouTubeTools. It gives access to video metadata and captions, which the agent then passes to the chosen LLM to generate the summary.

Here is the code for a YouTube summarizer agent:

```python
from phi.agent import Agent
from phi.model.groq import Groq
# from phi.model.openai import OpenAIChat  # only needed if you switch to an OpenAI model
from phi.tools.youtube_tools import YouTubeTools

agent = Agent(
    # model=Groq(id="llama3-8b-8192"),  # smaller alternative model
    model=Groq(id="llama-3.3-70b-versatile"),  # swap the id to try a different LLM
    tools=[YouTubeTools()],  # lets the agent fetch video metadata and captions
    show_tool_calls=True,  # print each tool call the agent makes
    # debug_mode=True,
    description="You are a YouTube agent. Obtain the captions of a YouTube video and answer questions.",
)

agent.print_response(
    "Summarize this video https://www.youtube.com/watch?v=vStJoetOxJg",
    markdown=True,
    stream=True,
)
```

The YouTube link in the code above points to a video of Andrew Ng on the Machine Learning Specialization. The output generated by the YouTube Summarizer agent accurately summarizes the video content. Note that the response may vary from run to run because of the probabilistic nature of LLMs.

Detailed Tutorial

Clone Notebook

Step 1: Clone the Colab notebook here (it requires a Google account).
Step 2: Install the dependencies (the first cell with code).

Get API key for Groq

Because the Agent uses the Groq model hosting platform, we need a Groq account. Follow the steps below to sign up or log in to Groq and get an API key.

Step 1: Visit the Groq Developer Portal. Open your browser and go to: https://console.groq.com
Step 2: Sign Up or Log In. If you already have an account, click Log In. If you're new, click Sign Up and follow the prompts to create an account (you may need to verify your email).
Step 3: Access the API Section. Once logged in, you'll land on the Groq Console. Navigate to the API Keys section from the sidebar or dashboard.
Step 4: Generate a New API Key. Click the "Create API Key" button. Give your key a name (e.g., "workshop-key"). Click Create or Generate.
Step 5: Copy and Store the Key Securely. Your API key will be shown only once, so copy it immediately and store it in a secure location. Never expose your API key in client-side code or public repositories.

Add the API key in the Secret Manager

Step 1: Click on Secrets (the key icon) in the left pane of Colab.
Step 2: Enter GROQ_API_KEY as the name and, as the value, the API key copied in Step 5 above.
Step 3: Toggle notebook access "ON".
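Once the secret is saved, the notebook still has to read it at runtime. The snippet below is a minimal sketch of one way to do that, not necessarily what the linked notebook does. It relies on Colab's `google.colab.userdata.get` and falls back to a plain environment variable outside Colab. The helper name `load_groq_api_key` is hypothetical, and the sketch assumes the Groq client picks up `GROQ_API_KEY` from the environment.

```python
import os


def load_groq_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return the key from Colab Secrets if available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
        key = userdata.get(name)
    except ImportError:
        # Outside Colab, fall back to a regular environment variable.
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it under Secrets and enable notebook access."
        )
    # Export the key so libraries that read the environment can find it.
    os.environ[name] = key
    return key

# In the notebook, call load_groq_api_key() once before creating the Agent.
```

Reading the key this way keeps it out of the notebook's cells, so the notebook can be shared without exposing the secret.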


Build AI Agents: YouTube Summarizer Agent
