Strip away the noise around the words “AI agent” and “LLM”, and the point remains the same: we are once again trying to create programs that do not just follow instructions, but understand what is wanted from them and achieve the goal, even when no explicit algorithm is defined in advance.
Businesses today live with variable goals, incomplete data, and demands to act and make decisions as quickly as possible. That's why a new architecture is emerging: AI agents that don't just react, but can independently develop a logic of actions, adjust their tactics, utilize tools, and remember what has already been tried.
How are such agents internally organized? Why, without architecture, do they turn into a regular chatbot? And where do they really work today - not in demos, but in production?
Let's look at examples: from LangGraph and CrewAI architecture to the practice of implementation in ML projects and finance.
From rule-based systems to LLM agents
Over the past year, AI agents have become one of the most talked-about topics in the industry. But if you take away the hype, the idea itself is not new at all. The first attempts to create program “executors” with minimal autonomy were made back in the 1980s. Back then, we were talking about expert systems: they worked on predefined rules (IF-THEN), and knowledge was entered manually. Effective? Yes, if the conditions are stable. Flexible? Not in the slightest.
Today's business challenges include variable goals, incomplete data, and unexpected exceptions. The rule “if X, do Y” will not work here. This creates a growing demand for a better system: better not only than its rule-based predecessors, but also than simply adding another human employee to the workflow and deepening the routine.
From this perspective, a modern AI agent is indeed helpful, as it's not just a rule-based bot but an autonomous decision-making system that has:
- a goal (and the context in which it changes);
- an architecture that allows for state persistence;
- access to external data, APIs, models;
- logic for planning and re-evaluating the outcome.
To better understand how this approach works, let's break down a few real-world architectures.
- Microsoft's AutoGen Studio offers a visual approach without programming, where each agent is a role with its own memory, well-defined tasks, and toolset. This is especially useful for business users who want to create an agent system without deep technical knowledge.
- LangGraph goes further and allows you to build complex chains of interaction where agents can share context, pass results to each other, and work in parallel processes or cyclic operations. This framework opens up the possibility of building truly complex systems where agents can coordinate their actions at a high level.
- CrewAI offers an interesting approach, where each agent receives not just a function, but an entire “personality” - a specific role, professional background, and specialized tools. At the same time, the interaction between agents follows a predefined scenario: they pass tasks to each other, clarify intermediate results, and jointly plan the next steps.
What makes an agent a really working tool and not just a technological demonstration?
At the level of engineering implementation, a quality agent is far more than just a language model with a properly written prompt. It is a modular system where the model itself is only one of the components, and the other elements are no less critical for successful operation.
Context becomes the memory of the system - the history of previous requests, decisions made, intermediate steps, and their results. Without this component, the agent will constantly “forget” what it has already tried, what approaches were ineffective, and will repeat the same mistakes.
The planner acts as a strategic center that chooses the optimal strategy for solving a particular problem. It decides whether to use the ReAct, Plan&Execute, or Function Agent approach, and it is critical when the agent must independently break down a complex task into manageable subtasks.
The tool layer provides connectivity to the outside world - APIs, databases, search engines, data parsing, and code generation. A language model without tools remains just a chatty interlocutor, but with properly configured tools, it turns into a real executor.
The executor manages execution: it can run chains of steps sequentially or in parallel, optimizing performance and keeping the process reliable.
The output parser translates the model's answers into a structure the system can understand and helps determine what to do next, which is critical for automating decision-making.
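As a rough illustration of how these components fit together, here is a minimal, library-free sketch. Everything in it is an assumption made for the example: `stub_model` stands in for a real LLM call, `lookup` is a made-up tool, and the JSON action format is only one possible convention for the output parser.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Context:
    """Memory of the run: every step taken and its result."""
    history: list[dict] = field(default_factory=list)

    def remember(self, step: str, result: str) -> None:
        self.history.append({"step": step, "result": result})


def stub_model(goal: str, context: Context) -> str:
    """Hypothetical stand-in for an LLM call; returns a JSON string naming the next action."""
    if not context.history:
        return json.dumps({"action": "lookup", "input": goal})
    return json.dumps({"action": "finish", "input": context.history[-1]["result"]})


def parse_output(raw: str) -> tuple[str, str]:
    """Output parser: turn raw model text into an (action, input) pair."""
    data = json.loads(raw)
    return data["action"], data["input"]


# Tool layer: a single made-up tool for the example.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda query: f"3 documents found for '{query}'",
}


def run_agent(goal: str, max_steps: int = 5) -> tuple[str, Context]:
    """Executor: loop over plan -> parse -> act -> remember until the model finishes."""
    context = Context()
    for _ in range(max_steps):
        action, arg = parse_output(stub_model(goal, context))
        if action == "finish":
            return arg, context
        result = TOOLS[action](arg)
        context.remember(action, result)
    raise RuntimeError("step budget exhausted")


answer, ctx = run_agent("overdue invoices")
print(answer)
```

In a real system the stub would be replaced by a model call, and the planner would choose among several tools instead of following a fixed two-step script.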
Why is architecture proving to be more important than a perfect prompt?
Because if you just "feed" a GPT with a list of instructions, it's not an agent. A real agent system should be able to work without a constant direct request from a human, independently evaluate intermediate results, and decide whether to continue the current plan, change strategy, engage another tool, or signal an error. And most importantly, it must keep a history of its actions and learn from its own mistakes.
From the analysis of systems, examples, and mistakes, the structure of a ‘good agent’ emerges:
- Goal: clearly defined, possibly variable.
- Instruction/protocol: flexible, adaptive.
- Context/memory: history of actions, results, conclusions.
- Tools: APIs, code functions, knowledge bases.
- Quality assessment: feedback loop logic.
An agent is not an LLM with a prompt. It is a system.
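One way to picture this structure in code is the sketch below. It is an illustration under assumptions, not a reference design: the `Agent` class, its field names, and the two made-up strategies (`cache`, `warehouse`) are hypothetical. The point is the feedback loop: each attempt is assessed, recorded in memory, and followed by another strategy if the quality bar is not met.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    goal: str
    tools: dict[str, Callable[[str], str]]   # strategy name -> callable (hypothetical)
    assess: Callable[[str], bool]            # quality assessment of a result
    memory: list[tuple[str, str, bool]] = field(default_factory=list)

    def run(self) -> str:
        """Try each strategy in order, record every attempt, stop at the first good result."""
        for name, tool in self.tools.items():
            result = tool(self.goal)
            ok = self.assess(result)
            self.memory.append((name, result, ok))  # history of actions and outcomes
            if ok:
                return result
        # Signal an error instead of silently returning a bad answer.
        raise RuntimeError(f"no strategy met the quality bar for: {self.goal}")


agent = Agent(
    goal="summarize Q3 churn",
    tools={
        "cache": lambda goal: "",                      # fails: returns nothing
        "warehouse": lambda goal: f"report for {goal}",
    },
    assess=lambda result: len(result) > 0,
)
outcome = agent.run()
print(outcome)
print(agent.memory)
```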
What does the basic agent code look like?
Using LangChain as an example, a simple agent can be created in a few dozen lines of Python code. One such example:
This code creates an agent that can independently search for news on the Internet and generate answers. It is a simple example, but it illustrates the key idea: an agent is an LLM plus tools.
Another example is CrewAI, a framework for building multi-agent systems. In CrewAI, each agent has:
- role,
- goal,
- backstory,
- a set of tools,
- and even a personality prompt.
Example of implementation:
How does it work?
In the CrewAI system, agents can work both sequentially and simultaneously. For example, one agent researches a topic, and another writes based on the results. They can share context, iteratively refine tasks, and adapt to changes.
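The sequential case can be sketched without the framework itself. The code below does not use the CrewAI API; `RoleAgent`, `run_sequential`, and the lambda "workers" are hypothetical names introduced only to show one agent handing its result to the next while a shared transcript is kept.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoleAgent:
    role: str
    goal: str
    backstory: str
    work: Callable[[str], str]  # hypothetical stand-in for an LLM call with tools

    def perform(self, task_input: str) -> str:
        return self.work(task_input)


def run_sequential(agents: list[RoleAgent], task: str) -> list[tuple[str, str]]:
    """Pass each agent's output to the next one and keep a shared transcript."""
    transcript: list[tuple[str, str]] = []
    current = task
    for agent in agents:
        current = agent.perform(current)
        transcript.append((agent.role, current))
    return transcript


researcher = RoleAgent(
    role="researcher",
    goal="collect facts",
    backstory="former market analyst",
    work=lambda topic: f"notes on {topic}",
)
writer = RoleAgent(
    role="writer",
    goal="draft an article",
    backstory="tech journalist",
    work=lambda notes: f"draft based on {notes}",
)

transcript = run_sequential([researcher, writer], "agent frameworks")
for role, output in transcript:
    print(f"{role}: {output}")
```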
Typical mistakes when working with agents:
- Lack of consistency: if you don't think through the logic of agent interaction, they quickly break down.
- Overcomplication: more agents are not necessarily better. Sometimes, a simple pipeline is more effective.
- Failure to consider memory: without well-configured memory, the agent starts to ‘forget’ the context.
- Improper use of tools: an agent can make dozens of API requests without any real effect if the logic is not optimized.
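The last mistake in this list can be contained with a simple guard. The sketch below is one possible approach, not a feature of any particular framework: a hypothetical `BudgetedTool` wrapper caches identical requests and raises an error once a call limit is reached.

```python
from typing import Callable


class ToolBudgetExceeded(RuntimeError):
    """Raised when a tool has used up its allowed number of real calls."""


class BudgetedTool:
    """Wraps a tool, caches repeated calls, and caps the number of real calls."""

    def __init__(self, tool: Callable[[str], str], max_calls: int) -> None:
        self.tool = tool
        self.max_calls = max_calls
        self.calls = 0
        self.cache: dict[str, str] = {}

    def __call__(self, arg: str) -> str:
        if arg in self.cache:  # identical request: answer from cache, no new call
            return self.cache[arg]
        if self.calls >= self.max_calls:
            raise ToolBudgetExceeded(f"limit of {self.max_calls} calls reached")
        self.calls += 1
        self.cache[arg] = self.tool(arg)
        return self.cache[arg]


# The lambda is a made-up stand-in for an external API request.
search = BudgetedTool(lambda query: query.upper(), max_calls=2)
search("rates")
search("rates")  # served from cache
search("fx")
print(search.calls)
```

A limit like this does not make the agent's logic better, but it turns a silent loop of useless requests into a visible error that can be handled.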
What does it mean to integrate an agent into a business process?
An AI agent doesn't create value by itself. Its power lies in the ability to work with real systems, understand the context, act without constant human supervision, and save resources.
In Data Science UA's practice, we observe that agent architecture begins to deliver ROI when it is integrated into daily operations. For example:
- In finance, the agent reconciles invoices, detects anomalies in payments, and generates alerts for the accounting department. In cases with 10,000-15,000 transactions per month, this means up to $15,000-$20,000 in savings on checks alone.
- In support services, instead of a classic bot, the agent retrieves the customer's history from the CRM, checks the status of transactions in the database, and tailors the response accordingly. The average response time decreases from 3 minutes to 45 seconds, and customer satisfaction (CSAT) increases by 18%.
- In DevOps, the agent monitors service statuses, recognizes deviations in logs, and initiates a restart or an alert. The system responds within 5-10 seconds after the anomaly occurs.
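To make the finance case more concrete, here is a minimal sketch of the reconciliation step alone. It is an assumption about how such a check could look, not a description of a production system: the `Record` type, the `reconcile` function, and the sample data are invented for the example, and the ERP export is modeled as a plain list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    invoice_id: str
    amount_cents: int  # integer cents avoid floating-point rounding issues


def reconcile(invoices: list[Record], erp: list[Record], tolerance_cents: int = 0) -> list[str]:
    """Compare parsed invoices with ERP records and return human-readable alerts."""
    erp_by_id = {record.invoice_id: record for record in erp}
    alerts: list[str] = []
    for inv in invoices:
        match = erp_by_id.get(inv.invoice_id)
        if match is None:
            alerts.append(f"{inv.invoice_id}: missing in ERP")
        elif abs(match.amount_cents - inv.amount_cents) > tolerance_cents:
            alerts.append(
                f"{inv.invoice_id}: amount mismatch ({inv.amount_cents} vs {match.amount_cents})"
            )
    return alerts


# Invented sample data: three parsed invoices, two ERP entries.
invoices = [Record("INV-1", 12000), Record("INV-2", 7550), Record("INV-3", 1000)]
erp = [Record("INV-1", 12000), Record("INV-2", 7750)]

alerts = reconcile(invoices, erp)
for alert in alerts:
    print(alert)
```

In an agent setup, the parsing of PDFs and the delivery of alerts would sit around this function as tools; the comparison itself stays deterministic and easy to audit.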
How agents optimize business processes
| Industry | Typical process | Classic execution | Agent implementation | Effect | 
|---|---|---|---|---|
| Finance | Invoice reconciliation | An accountant manually checks the PDF | Agent reads, parses, and compares with ERP | -80% time, fewer human errors |
| Support | Answers to queries | FAQ bot, manual escalation | Agent checks status, pulls up context, and gives an answer | +18% CSAT, -60% human workload |
| DevOps | Service monitoring | Monitor + engineer | Agent analyzes metrics, responds, and triages | -50% incidents, -70% time to alert |
| HR | Recruiting | Recruiter analyzes resumes without any help | Agent evaluates the candidate's path, generates recommendations | +30% relevance, -40% time spent |
| Sales Enablement | Preparation of commercial offers | The manager collects data from CRM, generates an offer | Agent extracts data, forms a structure, and adapts the text | -3 hours/day per salesperson |
Agents that help ML researchers
Most conversations about AI agents today are about customer service, process automation, or text generation. But beyond the headlines about chatbots and marketing use cases, something much deeper is happening: agents are starting to change the way machine learning teams work themselves. And we're not talking about “code assistants”, but about an intelligent infrastructure that transforms the way we research, analyze, and manage projects.
Context is the main currency!
Anyone who has ever worked in an R&D team knows that most of the time is spent not on modeling, but on restoring context. It takes hours to go back to the code that was written two weeks ago, find out where the errors were, understand which hypotheses have already been tested, and restore the chain of changes. Especially in large projects where several architectures are tested simultaneously and the data pipeline is constantly updated.
This is where an AI agent turns out to be both useful and critical. If the system has access to the entire history of the project - code, logs, architectural decisions, communications between participants - it can quickly recreate the picture of the research. Such an agent helps to form the overall picture without unnecessary Slack messages or meetings. It will keep in mind what the team has long forgotten.
Not code generation, but knowledge creation
Unlike classical agents that respond to requests such as “write a clustering function”, modern ML assistants solve problems of a completely different level. They do not just generate code but understand the project structure, see weaknesses, and form hypotheses.
How does it work? For example, the team is testing a new customer segmentation for a banking product. The agent sees that recent experiments have yielded unstable results and suggests options, such as checking how the data structure has changed over the past month or whether new attributes have affected the correctness of the clusters. In response, it doesn't give general advice, but creates a ready-made pipeline with checks, adds explanations, and compares the results with previous iterations.
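A check of the kind described here, whether the data structure has changed over the past month, might look roughly like the sketch below. It is a simplified assumption: snapshots are plain lists of dictionaries, `compare_snapshots` is a hypothetical helper, and a relative shift in the column mean is used as a crude stand-in for a proper drift test.

```python
from statistics import mean


def compare_snapshots(old: list[dict], new: list[dict], shift_threshold: float = 0.2) -> dict:
    """Report added/removed columns and numeric columns whose mean moved noticeably."""
    old_cols = set().union(*(row.keys() for row in old))
    new_cols = set().union(*(row.keys() for row in new))
    report = {
        "added": sorted(new_cols - old_cols),
        "removed": sorted(old_cols - new_cols),
        "shifted": [],
    }
    for col in sorted(old_cols & new_cols):
        old_vals = [r[col] for r in old if isinstance(r.get(col), (int, float))]
        new_vals = [r[col] for r in new if isinstance(r.get(col), (int, float))]
        if not old_vals or not new_vals:
            continue  # non-numeric column: skipped in this sketch
        base = mean(old_vals)
        if base != 0 and abs(mean(new_vals) - base) / abs(base) > shift_threshold:
            report["shifted"].append(col)
    return report


# Invented sample data for two monthly snapshots.
last_month = [{"age": 30, "balance": 1000}, {"age": 40, "balance": 3000}]
this_month = [
    {"age": 31, "balance": 4000, "segment": "a"},
    {"age": 39, "balance": 6000, "segment": "b"},
]

report = compare_snapshots(last_month, this_month)
print(report)
```

Here the new `segment` attribute and the jump in `balance` are exactly the kind of changes an agent could surface before anyone re-runs the clustering.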
Tools that help in everyday work
In practice, agents aren't built from scratch. Teams integrate existing frameworks and modules. For example, one of the most popular tools today is Grok DeeperSearch, which allows an agent to conduct deep semantic searches not only in documents but also in technical manuals, competitors' models, and Jira archives. It saves hours that would normally be spent analyzing third-party sources.
Another example is Manus AI. With this tool, an agent can automatically generate technical documentation, describe the code structure, and suggest optimization options. For a team that works at a frantic pace, this means that there is no need to write reports manually anymore. It's sufficient to have a structured folder with the results - the agent will compile a summary, identify the most important points, and prepare materials for demonstration to the customer or stakeholders.
Working with code is a new reality
Thanks to advanced LLM integrations, agents can not only read code but also analyze it as a living structure. They track connections between modules, see the logic of changes, and detect repetitions. For example, when several people in a team are working on similar features at the same time, the agent will tell you that there is already an implementation with similar logic and avoid duplication of work.
Moreover, agents “memorize” the team's code style. They form a unified design structure, suggesting when and where to put logic in separate classes or apply templates. This is not just ESLint or black. This is context-sensitive support for the style and quality of the entire project.
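The duplicate-detection idea can be approximated even without an LLM. The sketch below is a deliberately simple assumption about one building block: it compares the syntax trees of top-level functions with the standard `ast` and `difflib` modules. The helper names and the 0.9 threshold are arbitrary choices for the example.

```python
import ast
from difflib import SequenceMatcher
from itertools import combinations


def function_bodies(source: str) -> dict[str, str]:
    """Map each top-level function name to a text dump of its body's syntax tree."""
    tree = ast.parse(source)
    return {
        node.name: "".join(ast.dump(stmt) for stmt in node.body)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def similar_functions(source: str, threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of functions whose bodies are nearly identical."""
    bodies = function_bodies(source)
    return [
        (a, b)
        for a, b in combinations(sorted(bodies), 2)
        if SequenceMatcher(None, bodies[a], bodies[b], autojunk=False).ratio() >= threshold
    ]


# Invented sample module with one duplicated implementation.
SOURCE = '''
def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def scale_to_one(values):
    total = sum(values)
    return [v / total for v in values]

def count_rows(rows):
    return len(rows)
'''

pairs = similar_functions(SOURCE)
print(pairs)
```

An agent with project context goes further than this, since it can also recognize implementations that differ in naming or structure but serve the same purpose.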
Organizational benefit: time and focus
Even if an agent isn't able to write a complex model better than a human, it already saves dozens of hours. There is no need to spend time searching for files, testing preliminary hypotheses, communicating between teams, or manually updating documentation. The agent takes care of all this.
The result? The researcher focuses not on technical noise but on building new hypotheses. The team works smoothly. Cognitive load is reduced. The chaos in notes and chains of correspondence disappears. And most importantly, there is a sense of control over the process, even if the project is moving quickly and in parallel in several directions.
So, is it worth implementing AI agents today?
The true potential of such agents will be revealed when they can lead the project not just as technical assistants, but as meta-managers who see the big picture. They will know what experiments have already been conducted, what conclusions have been drawn, what worked, and what didn't. They will initiate new iterations themselves. And not just based on prompt requests, but by understanding the goals of the team and the business.
This requires two key components: a stable environment architecture and discipline in project management. Without them, an agent is just a bot. With them, it's a real researcher, only without fatigue, vacations, and subjective mistakes.
