How to Build an AI Agent That Actually Handles Boring Tasks for You

Ah, AI agents… the hottest trend in tech right now. Everyone's hyped about them being the future of work. After all, they can do it all and will automate most tasks to give us more time, right? Well… sort of.

The reality? Most agents get blocked by websites or get lost while trying to execute tasks. To actually make one that works, you need a best-in-class tech stack. Only the right combination of tools can turn an AI agent into a real task-automation machine.

Follow this tutorial and learn how to craft an AI agent that can truly automate tasks for you!

Why Most AI Agents Don’t Deliver

The dream of having AI automate tasks for us is exactly why AI agents were invented in the first place. It’s why “agentic AI” became a trend, and why the hype is still sky-high.

Imagine a world where all the tedious, repetitive stuff gets handled by AI so we can save time. Sounds perfect, right?

That way, we could focus on what really matters: stacking V-Bucks in Fortnite or grinding runes in Elden Ring.

Jokes aside, if you’ve ever played around with an AI agent like OpenAI Operator or tried building one yourself, you already know the sad truth: AI agents rarely live up to expectations!

These are some of the main reasons AI agents flop:

They can’t interact with websites or desktop apps like a real human would.
LLMs powering them can be unpredictable, giving different results on the same input.
Even when they do use a browser, anti-bot techniques like CAPTCHAs stop them cold.
Unlike humans, AI agents often lack common sense reasoning and struggle to adapt when faced with situations beyond their programming.

The problem isn’t the idea of AI agents. Instead, it’s the tech stack you use to build them.

So let’s stop wasting time and figure out how to build an AI agent that can actually automate browser tasks for you.

Make an AI Agent Automate the Stuff You Hate Doing: Step-by-Step Tutorial

This section walks you through building an AI agent that can handle one of the most boring (yet critical) tasks out there: job hunting!

The resulting AI agent will be smart enough to:

Visit Google
Discover job platforms
Browse listings based on your desired positions and preferences
Extract interesting jobs
Export them into a clean JSON file

And if you want to take it further, you’ll also find resources on how to feed it your CV so the agent can learn your profile and automatically apply to the best matches—all without you lifting a finger.

⚠️ Important: This is just an example! As explained near the end of this guide, the same agent can be adapted to almost any browser-based workflow by simply changing the task description.

Let’s dive in!

Prerequisites

To follow along with this tutorial, make sure you have:

An LLM API key (we’ll use Gemini, since it’s basically free to use via API, but OpenAI, Anthropic, Ollama, Groq, and others work as well).
A Bright Data account with the Browser API enabled (don’t worry about setup yet, as you’ll be guided through it in this tutorial).
Python ≥ 3.11 installed locally.

To speed things up, we’ll also assume you already have a Python project set up with a uv virtual environment in place.

Step #1: Install Browser Use

As mentioned earlier, most AI agents flop because they hit the wall of tech limitations 🧱. The models alone just aren’t enough. So what’s one of the best tools for building AI agents that can actually do stuff inside a browser? 👉 Browser Use!

Never heard of it? No worries! Catch up with this video or take a look at its official docs:

https://www.youtube.com/watch?v=zGkVKix_CRU&embedable=true

First things first, activate your uv venv and install the browser-use package from PyPI:

uv pip install browser-use

Under the hood, this library runs on Playwright, so you’ll also need to grab the Chromium binaries it depends on. To do so, run:

uvx playwright install chromium --with-deps --no-shell

Boom! 💥 You’re now set up with a browser automation agentic AI powerhouse.

Step #2: Integrate the LLM

AI agents won’t do much without AI (shocker, right? 😅), so your agent needs a language model to properly think. Browser Use supports a long list of LLM providers, but we’ll focus on Gemini, the one highlighted on the official browser-use GitHub page.

Why Gemini? Because it’s one of the few LLMs whose API rate limits are generous enough to make it essentially free to play with. 🆓

Grab your Gemini API key and store it in a .env file in your project folder like this:

GEMINI_API_KEY=<YOUR_GEMINI_API_KEY>

Next, create an agent.py file, which will contain the AI agent definition logic. Start by loading the environment variables from .env using python-dotenv (which comes with browser-use):

from dotenv import load_dotenv

# Read the environment variables from the .env file
load_dotenv()

Then, define your LLM integration:

from browser_use import ChatGoogle

# The LLM powering the AI agent
llm = ChatGoogle(model="gemini-2.5-flash")

Amazing! You’ve got your AI engine ready. 🧠

Time to define and build the rest of your agent’s logic…

Step #3: Describe the Browser-Based Task to Automate

How you describe the task to your agent is everything. The LLM you configured in Browser Use only works as well as your instructions, so spend time crafting a prompt that’s clear, detailed, but not overly complicated.

This is the most important step in your implementation, so check out guides on prompt design and follow the Browser Use best practices to maximize results. Expect a few rounds of trial and error. 🧪

Since this is just an example, let's keep it simple and describe the browser job-hunting task like this:

task = """
Search on Google for software engineer jobs in New York.
1. Choose a job posting page.
2. On the chosen site, filter for jobs published within the last 24 hours.
3. For each job listing, extract the key details, including the job posting URL and the apply URL (if available).
4. Return all results as a JSON list.
"""

As you can see, you’re giving your agent a lot of freedom, which is totally fine considering how capable and flexible Browser Use is! 💪

💡 Tip: In a real-world setup, you should read preferences from a configuration file and inject them into your prompt. This makes your agent customizable for different searches. Think varying job titles, locations, required skills, company preferences, remote vs on-site, and more. For a similar approach, read our guide on building a LinkedIn job hunting AI assistant.
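As a minimal sketch of that tip, the snippet below renders the task prompt from a set of preferences. The preference keys (job_title, location, max_age_hours, work_mode), the template wording, and the preferences.json file name are all assumptions made for illustration. They are not part of Browser Use.

```python
import json

# Hypothetical preference keys; adapt them to your own search.
DEFAULT_PREFERENCES = {
    "job_title": "software engineer",
    "location": "New York",
    "max_age_hours": 24,
    "work_mode": None,  # e.g. "remote"; None means no preference
}

TASK_TEMPLATE = """
Search on Google for {job_title} jobs in {location}.
1. Choose a job posting page.
2. On the chosen site, filter for jobs published within the last {max_age_hours} hours.{work_mode_step}
3. For each job listing, extract the key details, including the job posting URL and the apply URL (if available).
4. Return all results as a JSON list.
"""


def load_preferences(path: str) -> dict:
    """Read user preferences from a JSON configuration file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def build_task(preferences: dict) -> str:
    """Merge user preferences over the defaults and render the task prompt."""
    prefs = {**DEFAULT_PREFERENCES, **preferences}
    work_mode = prefs.pop("work_mode")
    # Only mention the work mode when the user actually set one
    work_mode_step = f" Keep only {work_mode} positions." if work_mode else ""
    return TASK_TEMPLATE.format(work_mode_step=work_mode_step, **prefs)


# In a real setup: task = build_task(load_preferences("preferences.json"))
task = build_task({"job_title": "data engineer", "location": "Boston", "work_mode": "remote"})
print(task)
```

Keeping the template in one place means a new search only requires editing the configuration file, not the agent code.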

Step #4: Define and Run the Agent

Use Browser Use to spin up an AI agent controlled by your configured LLM that can tackle the task you defined earlier:

from browser_use import Agent

agent = Agent(
    llm=llm,
    task=task,
)

Fire up your agent like this:

history = agent.run_sync()

Perfect! Now all that’s left is to grab the output from your AI agent and export it to JSON (or any format you need). 💾

Step #5: Export the Output to JSON

Grab the output from your agent (which should be a clean JSON list of jobs) and dump it to a .json file:

import json

output_data = history.structured_output
with open("jobs.json", "w", encoding="utf-8") as f:
    json.dump(output_data, f, ensure_ascii=False, indent=4)
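LLM output is not always as clean as you asked for. Depending on your Browser Use version and on whether an output schema is configured, history.structured_output may be empty, and the final answer may only be available as text, sometimes wrapped in a Markdown code fence. Treat the exact library behavior as an assumption to verify against your installed version. The helper below is a hypothetical, library-independent sketch that tries to recover JSON from such a text answer.

```python
import json
import re

# Build the Markdown fence marker programmatically (three backticks)
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_output(raw):
    """Best-effort conversion of an agent's final text answer into Python data.

    Returns the parsed object, or None when no valid JSON can be recovered.
    """
    if raw is None:
        return None
    text = raw.strip()
    # If the model wrapped its answer in a Markdown fence, keep only the inside
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        text = fenced.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


print(parse_json_output('[{"job_title": "Software Engineer"}]'))
```

If the helper returns None, it is usually safer to save the raw text for inspection than to write an empty jobs.json file.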

Here we go! Mission complete. Boring task handler agent at your service! 🫡

Step #6: Address the Agent Limitations

Browser Use is incredible—but not magical, unfortunately…

If you try to run your AI agent now, it’ll probably get blocked, for example by a Google reCAPTCHA. (See how to automate reCAPTCHA solving.)

If it somehow gets past that, there’s still the Indeed human verification page powered by Cloudflare.

These failures are especially common if you run the script on a server or in headless mode—which, let’s be honest, is exactly what you want. No one wants a machine tied up for minutes while it handles a task! 😣

So yeah, so far you’ve built an AI agent that fails… just like all the others 😢. Was that a waste of time? Nope, because the tutorial isn’t over yet!

There’s still the most important step. The one that actually makes this whole thing work. 🤩

Step #7: Integrate Agent Browser

Your agent fails because the sites it interacts with can detect it as an automated bot. How does that happen? Tons of reasons, including:

Browser fingerprinting: The browser session created by default in Playwright is super generic and doesn’t look like a real user.
Rate limiters: Your agent ends up making too many requests in a short time (classic for automation, not humans), which triggers suspicion instantly.
IP reputation: The more automation scripts you run from your IP, the more solutions like Cloudflare flag you as a potential bot, which increases the chances of a CAPTCHA or other verification.

So, what’s the solution? A browser that:

Runs human-like sessions, mimicking real user behavior.
Can solve CAPTCHAs automatically if they appear.
Integrates with a proxy network with millions of rotating IPs to avoid rate limits.
Runs in the cloud for infinite scalability.
Integrates seamlessly with AI.

Is this a dream? Nope! It exists, and it’s called Agent Browser (aka Browser API)!

https://www.youtube.com/watch?v=T59GCkpk5zY&embedable=true

Follow the official Agent Browser integration guide, and you’ll end up on a page showing the connection details for your browser zone.

Copy your connection URL from that page and add it to your .env file like so:

BRIGHT_DATA_BROWSER_AGENT_URL=<YOUR_AGENT_BROWSER_URL>

Then, read it in agent.py and define the Browser object to instruct Browser Use to connect to the remote browser:

import os
from browser_use import Browser

BRIGHT_DATA_BROWSER_AGENT_URL = os.getenv("BRIGHT_DATA_BROWSER_AGENT_URL")
browser = Browser(
    cdp_url=BRIGHT_DATA_BROWSER_AGENT_URL
)
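Note that os.getenv() returns None when the variable is missing, so a typo in .env only surfaces later as a confusing connection error. A minimal sketch of a fail-fast check is shown below. The require_env and looks_like_cdp_url helpers are hypothetical names introduced for illustration, and the set of accepted URL schemes is an assumption about typical CDP endpoints.

```python
import os
from urllib.parse import urlparse


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing environment variable {name}. Add it to your .env file.")
    return value


def looks_like_cdp_url(url: str) -> bool:
    """Rough sanity check: the URL has a supported scheme and a host name."""
    parsed = urlparse(url)
    return parsed.scheme in {"ws", "wss", "http", "https"} and bool(parsed.hostname)


# Demo with a placeholder value (not a real endpoint)
os.environ["DEMO_BROWSER_URL"] = "wss://user:pass@example.com:9222"
demo_url = require_env("DEMO_BROWSER_URL")
print(looks_like_cdp_url(demo_url))
```

In agent.py, you could call require_env("BRIGHT_DATA_BROWSER_AGENT_URL") in place of the plain os.getenv() call.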

Next, pass the browser object to your agent:

agent = Agent(
    llm=llm,
    task=task,
    browser=browser,  # <---
)

Your AI agent will now execute tasks in remote Agent Browser instances without being blocked or interrupted. What a clutch! 🏆

Put It All Together

Your final agent.py should contain:

from browser_use import ChatGoogle, Agent, Browser
from dotenv import load_dotenv
import json
import os

# Read the environment variables from the .env file
load_dotenv()

# The LLM powering the AI agent
llm = ChatGoogle(model="gemini-2.5-flash")

# The task the AI agent will do on your behalf
task = """
Search on Google for software engineer jobs in New York.
1. Choose a job posting page.
2. On the chosen site, filter for jobs published within the last 24 hours.
3. For each job listing, extract the key details, including the job posting URL and the apply URL (if available).
4. Return all results as a JSON list.
"""

# Read the Bright Data Agent Browser CDP URL from the env
BRIGHT_DATA_BROWSER_AGENT_URL = os.getenv("BRIGHT_DATA_BROWSER_AGENT_URL")
# Configure a remote browser
browser = Browser(
    cdp_url=BRIGHT_DATA_BROWSER_AGENT_URL
)


# Define an AI agent to perform the task in the configured browser
agent = Agent(
    llm=llm,
    task=task,
    browser=browser,
)

# Execute the AI agent
history = agent.run_sync()

# Export the found jobs to a JSON output file
output_data = history.structured_output
with open("jobs.json", "w", encoding="utf-8") as f:
    json.dump(output_data, f, ensure_ascii=False, indent=4)

Test it by running it with:

python agent.py

Browser Use can generate a GIF of the execution (perfect for debugging 🐛). It shows the AI agent accessing Google, then Indeed, and filtering jobs by the required criterion (posted in the last 24 hours).

The result will be a jobs.json file in your project folder. It contains all the job data extracted from Indeed, ready for you to review and apply:

[
  {
    "job_title": "Software Engineer",
    "company": "Twitch Interactive, Inc.",
    "location": "New York, NY",
    "salary": "$99,500 - $200,000 a year",
    "employment_type": "Full-time",
    "benefits": [
      "Parental leave",
      "401(k)",
      "Health insurance",
      "Paid time off",
      "Employee discount",
      "Vision insurance"
    ],
    "apply_url": "https://www.indeed.com/rc/clk?jk=d57f1f5ae2ce39b2&bb=KSTlUgVEMf-eBJjV36L3azapF2zEi4bBvUN2hIAcYXrYbXRZ5eWSuITPoUpo_Z8dlLX2UOM82XGDxHt0-Ahisofl6e8m0YvqC6Hh37bUv4Ph18Wp4oM2lqjW0jgm6q24kmXmCEOn4ZCXxMbVvGx1Lw%3D%3D&xkcb=SoAR67M3sAK4p3SDqh0LbzkdCdPP&fccid=fe2d21eef233e94a&vjs=3"
  },
  // other job postings omitted for brevity...
  {
    "job_title": "Fullstack .NET Developer, Analyst",
    "company": "MUFG Bank, Ltd.",
    "location": "Hybrid work in Jersey City, NJ 07302",
    "salary": "$87,000 - $123,000 a year",
    "employment_type": "Full-time",
    "benefits": [
      "Tuition reimbursement",
      "Paid parental leave",
      "Parental leave",
      "Health insurance",
      "Retirement plan",
      "Paid holidays"
    ],
    "apply_url": "https://www.indeed.com/rc/clk?jk=88f53bba78bb73d9&bb=KSTlUgVEMf-eBJjV36L3a5W1vAjJi2KOYfFuFmAdZolzMxeST7LmPwBH3Nh_N5WyZz05vH6_vGPa9dHkj6jgfo9yTQnbXCmfxYezDirnxuSYqjnNthL3s5UtUFYUkLK_DbCh8F545E0wDidVKUnxVQ%3D%3D&xkcb=SoBM67M3sAK4p3SDqh0FbzkdCdPP&fccid=3b98171e4a0fd997&vjs=3"
  }
]
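Once the jobs are in a file, you can post-process them with plain Python. The sketch below filters by keyword and location and drops duplicate apply URLs. It assumes the field names from the sample output above (job_title, location, apply_url), which may differ between runs because the LLM chooses them. The sample records and URLs are shortened placeholders.

```python
def filter_jobs(jobs, keyword=None, location=None):
    """Keep jobs whose title contains `keyword` and whose location contains `location`.

    Matching is case-insensitive; repeated apply_url values are dropped.
    """
    seen_urls = set()
    selected = []
    for job in jobs:
        title = job.get("job_title", "")
        place = job.get("location", "")
        url = job.get("apply_url")
        if keyword and keyword.lower() not in title.lower():
            continue
        if location and location.lower() not in place.lower():
            continue
        if url is not None:
            if url in seen_urls:
                continue  # same posting already selected
            seen_urls.add(url)
        selected.append(job)
    return selected


# Placeholder records shaped like the sample output (URLs are not real)
sample_jobs = [
    {"job_title": "Software Engineer", "location": "New York, NY",
     "apply_url": "https://example.com/apply/1"},
    {"job_title": "Software Engineer", "location": "New York, NY",
     "apply_url": "https://example.com/apply/1"},
    {"job_title": "Fullstack .NET Developer, Analyst",
     "location": "Hybrid work in Jersey City, NJ 07302",
     "apply_url": "https://example.com/apply/2"},
]

new_york_jobs = filter_jobs(sample_jobs, location="new york")
print(len(new_york_jobs))
```

To use it on real data, load jobs.json with json.load() and pass the resulting list to filter_jobs().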

Wow! 😲 In around 40 lines of code, you just built an AI agent that can automate virtually any browser task for you! (Want some ideas? Hang tight for a few more minutes and check out the next section.)

If you want to level up 🆙, you can even integrate it with logic to read your CV and apply for positions automatically, as shown in the official Browser Use example on GitHub.

Thanks to Bright Data's Agent Browser integration in Browser Use, you can now craft an unstoppable AI agent that handles all the boring tasks that drain your time and energy. The AI agent revolution is now!

Examples of Boring Tasks You Can Automate with This Agent

Want some ideas for tasks and chores this AI agent can handle? Check these out:

Find and schedule flights ✈️: Let the AI search for flights, compare options, and even book tickets based on your preferences.
Extract weather data for multiple cities 🌤️: Get real-time weather info for all the cities you’re traveling to, so you’re always prepared.
Schedule calls for you 📅: Rely on Calendly or a similar tool, and the AI will arrange meetings according to your availability.
Track Amazon product prices and buy low 💰: Monitor product prices and automatically purchase items when they hit your target price.
Collect news headlines 📰: Gather and summarize the latest news from multiple sources, so you don’t miss anything important.
Buy groceries for you 🛒: Provide a shopping list, and the AI will automatically purchase your groceries online, saving you time.

Want more ideas? Discover other AI agent use cases and scenarios.

Final Thoughts

Now you know how to build an AI agent that tackles boring, repetitive, dull, and time-consuming browser tasks for you.

That wouldn’t be possible without Browser Use, one of the coolest AI agent libraries out there—but the real game-changer is Bright Data’s Agent Browser, which gives your AI unstoppable, agent-ready cloud browser instances.

At Bright Data, our mission is simple: make AI accessible for everyone, everywhere—even for automated users. Until next time, stay bold, and keep building the future of AI with creativity. ✨