MCP Is Dead. The CLI Is Winning the AI Agent Stack

Written by fmind | Published 2026/03/26
Tech Story Tags: artificial-intelligence | large-language-models | software-development | product-management | cloud-computing | cybersecurity | ai-agent-tooling | cli-for-ai-agents

TL;DR: Why developers are ditching bloated agent protocols and turning to the CLI as the most practical foundation for building AI agents in 2026.

We are building the most advanced AI systems in history, yet the best way to control them relies on paradigms from the 1970s. I see developers wrestling with this friction daily, trying to bridge the gap between bleeding-edge models and legacy toolchains.

As of 2026, a simpler and clearer trend is emerging: the Command Line Interface (CLI) is becoming the most practical foundation for building agents. Tools like OpenClaw and the Google Workspace CLI demonstrate this well. Handing an agent a raw shell and a file system is often faster and more reliable than wrapping it in a complex protocol. As Eric Holmes noted in MCP is dead. Long live the CLI, modern LLMs already excel at using standard command-line utilities. These tools are lightweight, trivial to debug, and compose naturally.

To understand why this approach works, we need to look at where agent protocols struggled, and why the path forward means returning to the Linux philosophy.

The Problem with MCP (Model Context Protocol)

Model Context Protocol (MCP) was one of the first major agent protocols to emerge. Heavily inspired by how VS Code extensions work, its goal was noble: create a standardized way to expose all available resources and tools to an LLM.

In practice, however, MCP suffers from severe architectural flaws for everyday agentic workflows:

  • Context Bloat: MCP typically exposes all tools and their schemas at once. Massive tool descriptions eat up valuable context windows. On my Gemini CLI, simply booting up and saying “Hello” cost 50k input tokens.
  • Deployment Mess: The protocol tries to handle too many scenarios. It combines local runners (e.g., Docker, uvx, npx, …), remote execution, API keys, and OAuth into a single, often fragile deployment model.
  • Redundant Authentication: As Holmes pointed out, you often have to re-authenticate specifically for the MCP and define an entirely new set of tools, needlessly duplicating your existing environment.
  • Verbose Returns: Results are returned directly in the protocol’s format to the LLM, bloating the context even further with massive JSON structures rather than clean, human-readable text.
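To make the context-bloat point concrete, here is a hypothetical MCP-style tool declaration (the tool name and fields are invented for illustration, not taken from a real server). Every connected server ships dozens of blocks like this into the prompt before the conversation even starts:

```json
{
  "name": "calendar_create_event",
  "description": "Creates a calendar event. Supports recurrence, attendees, reminders, attachments, and conferencing options...",
  "inputSchema": {
    "type": "object",
    "properties": {
      "summary": {"type": "string", "description": "Event title"},
      "start": {"type": "string", "description": "RFC 3339 start time"},
      "end": {"type": "string", "description": "RFC 3339 end time"},
      "attendees": {
        "type": "array",
        "items": {"type": "string", "description": "Attendee email"}
      },
      "recurrence": {"type": "string", "description": "RRULE string"}
    },
    "required": ["summary", "start", "end"]
  }
}
```

Multiply this by a few dozen tools per server and a handful of servers, and a 50k-token "Hello" stops being surprising.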

🛠️ MCP: Experience from the Field

  • Development: Personally, I don’t use any MCPs on my local machine anymore. The setup is too bloated: I refuse to re-authenticate to yet another tool wrapper, or to micromanage servers by manually selecting which tools the agent should use for a given task.
  • Production: I often tell my team that MCPs are like the old regex joke: you had a problem that required MCP; now you have two problems. They lack maturity, and it’s rarely clear whether a specific MCP was built for a slick local dev experience or for robust production. They are incredibly hard to debug, and I frequently find myself monkey-patching them just to keep pipelines running smoothly.

A2A (Agent-to-Agent) Is Not a Tooling Protocol

For a long time, I believed Agent-to-Agent (A2A) communication would emerge as the compelling alternative to tool-binding protocols like MCP. While A2A remains the best way to orchestrate agents across different domains, it is not a protocol meant for low-to-medium level tool execution:

  • The Double-Token Tax: If you have a client agent delegating a simple task to a remote agent via A2A, you are paying token costs and incurring latency for two LLMs in the loop. For low-level execution, the cost far outweighs the benefits.
  • Rigid Abstractions: The remote agent comes with its own baked-in instructions and tools. While great for isolation, it’s highly detrimental if the local client agent needs to dynamically override or tweak how the remote task is executed.
  • Network Overhead: A2A inherently treats everything as a remote network call. This is powerful for distributed systems, but absolute overkill for local development tasks.

🛠️ A2A: Experience from the Field

  • Development: I don’t know anyone actually using A2A in their local development loop. Providers like GitHub tend to build their own proprietary APIs to call remote agents. When you operate inside a closed ecosystem, it’s often faster and easier to use native APIs than to force-fit an external protocol.
  • Production: I think A2A is incredibly promising here, and we already leverage it in production. It’s a fantastic approach for large organizations where you need to mirror complex corporate structures, allowing agents to delegate high-level tasks to specialized “worker” agents at scale.

Agent Skills Are Too Abstract

I previously thought Agent Skills were going to be the golden hammer — a concept I explored in MLOps Coding Skills: Bridging the Gap Between Specs and Agents. I was wrong.

  • Premature Abstraction: Skills are designed to be reusable. The trap is making them overly verbose on Day 1, trying to account for edge cases that don’t exist yet.
  • Context over Skills: On a new project, you are much better off using an AGENTS.md file, system instructions, or context files like the 5XP framework. They are vastly easier to operate and tweak.
  • Incomplete Taxonomy: Structuring everything as a “skill” is a flawed mental model. Not everything is a skill. Way-of-working, product objectives, and business context cannot be neatly packaged into executable skills.
  • The “Static Fit” Problem: When you import open-source skills or skills written by others, they rarely align 100% with your needs. Because they are static, they either fit perfectly, or they break. There is no inherent mechanism for the skill to learn or adapt to the user’s specific workflow.
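For contrast, here is a minimal sketch of the kind of AGENTS.md file the “context over skills” point argues for. The project details are invented for illustration:

```markdown
# AGENTS.md

## Project
Invoice-processing service. Python 3.12, FastAPI, PostgreSQL.

## Conventions
- Run `make test` before proposing any change.
- Never edit files under `migrations/` by hand.
- Prefer small, reviewable diffs over large refactorings.

## Business context
Invoices are legally binding: correctness beats speed.
```

A file like this takes minutes to write and seconds to tweak, which is exactly why it beats a prematurely generalized skill on a new project.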

🛠️ Agent Skills: Experience from the Field

  • Development: I use the 5XP framework now. I only extract a “skill” a posteriori — after a major refactoring, when I actively want to reuse a specific workflow across multiple projects.
  • Production: Skills provide a clean way to formalize and version-control agent instructions to share among colleagues. They complement A2A beautifully: the A2A protocol provides the workers, while the Skills repository provides their standardized instructions.

Rediscovering the Linux Philosophy

If bloated protocols and rigid abstractions are slowing us down, what is the alternative? We need an approach to agent tooling that is dynamic, composable, and lightweight. Unsurprisingly, the industry is circling back to the CLI.

For developers, treating the CLI as the primary interface for agents has undeniable benefits:

  • Battle-Tested Maturity: There are thousands of mature, edge-case-tested CLI tools already on the market. Furthermore, LLMs have ingested the man pages for these tools; they already know exactly how to use them.
  • Shared Environment: A human can step in and run the exact same command. The environment is already authenticated, and debugging is trivial (standard error messages, exit codes, easily readable flags).
  • Zero Bloat: It’s incredibly lightweight. You don’t need to deploy a wrapper or a daemon. You just execute the command.
  • Infinite Composability: Agents can natively pipe find, grep, jq, and curl together, without passing the intermediate tool output to the LLM.
  • Language Agnostic: It doesn’t matter if the underlying tool is written in Rust, Go, Python, or Bash. To the agent, it’s just a command.

What we are doing is rediscovering the Linux Philosophy:

  1. Everything is a file (or a text stream).
  2. Write small tools that do one thing well.
  3. Combine them to solve complex problems.
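The three principles above fit in a couple of lines of standard tools. In this hedged example (the log file and its format are invented), the agent composes grep, awk, sort, and uniq through pipes, and only the final summary, not the intermediate streams, would ever need to reach the LLM:

```shell
# Invented sample data: a tiny access log written as a plain text file.
printf 'GET /a 200\nGET /b 500\nGET /c 500\nGET /d 404\n' > access.log

# Small tools, one job each, combined through pipes; only these final
# summary lines (count + status code) need to be shown to the model.
grep -v ' 200$' access.log | awk '{print $3}' | sort | uniq -c | sort -rn
```

Swap any stage for another tool and the rest of the pipeline is untouched; that substitutability is the composability argument in practice.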

🛠️ Experience from the Field

  • Development: During my day-to-day development, I now rely almost entirely on CLI tools combined with raw text context. It is incredibly liberating. I’m deeply curious to see how far this pure approach can scale. We are even seeing frameworks adapt to this reality; for instance, the Google ADK recently added native bash tools to give agents direct shell access.

Side Note: I finally feel less guilty about the 1283+ commits on my dotfiles.

What’s Next? From Linux to Kubernetes for Agents

Right now, the raw CLI is the 80/20 solution for agent development. It delivers maximum leverage with minimum setup, accomplishing 80% of what we need and making developers extremely productive locally. But local development is not the final destination.

While giving an agent CLI access is like giving it a personal UNIX terminal, security and scale are the ultimate blockers for the CLI in production.

You cannot simply hand an autonomous LLM unconstrained bash access to your AWS account or your production database. Real-world deployment requires a rigorous security layer: strict authentication, Role-Based Access Control (RBAC), sandboxing, and immutable audit logs. This is a highly complex problem that raw CLI execution cannot solve safely at an enterprise scale.
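As a sketch of the smallest possible guardrail, here is a hypothetical allow-list wrapper in shell. The approved binaries are illustrative, and it checks only the first word of a command, so a real sandbox needs far more (containers, RBAC, audit logs):

```shell
# Illustrative allow-list: binaries the agent may invoke directly.
ALLOWED=" ls grep wc jq "

# Run an agent-proposed command only if its leading binary is approved.
# Note: this inspects only the first word, not the rest of a pipeline.
run_guarded() {
  bin=${1%% *}
  case "$ALLOWED" in
    *" $bin "*) eval "$1" ;;
    *) echo "blocked: $bin" >&2; return 1 ;;
  esac
}

run_guarded "ls / | wc -l"                       # approved, executes
run_guarded "rm -rf /important" || echo blocked  # refused before running
```

Even this toy version shows the trade-off: every rule added for safety chips away at the frictionless composability that made the raw CLI attractive in the first place.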

In traditional software, we didn’t abandon the Linux philosophy to build the cloud; we containerized and orchestrated it. We bridged the gap between a single bash instance and a globally distributed system.

We need the exact same evolution for AI. To bridge the gap between heavy, bloated agent protocols and the lightweight-but-insecure raw CLI, we need a new paradigm. We need something that adheres strictly to the composability of Linux, but is built for autonomous systems at scale.

My bet is that we don’t just need another protocol; we need an Operating System for Agents.

Just as the industry needed Kubernetes to safely orchestrate Linux containers across vast server networks, our agents will need an orchestration layer built specifically for AI workloads:

  • Process Management: Spinning agent loops up and down dynamically based on computing load.
  • Granular RBAC: Strict, declarative policies restricting exactly which binaries, data silos, and network endpoints an agent can touch.
  • Dynamic Service Discovery: An Istio-like service mesh for agents, allowing them to route tasks to specialized peers without hardcoded endpoints.
  • Message Queues: Enabling agents to share state and context asynchronously, freeing up expensive context windows.
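No such orchestrator exists yet, but a declarative policy for one might look something like this. This manifest is entirely hypothetical; the kind, field names, and values are invented to show the shape of the idea, modeled on Kubernetes conventions:

```yaml
# Hypothetical agent-RBAC manifest for an imagined agent orchestrator.
apiVersion: agents.example.io/v1alpha1
kind: AgentPolicy
metadata:
  name: billing-analyst
spec:
  binaries:                 # which CLI tools the agent may execute
    allow: [jq, psql, curl]
  data:
    allow: ["postgres://billing/*"]
    deny:  ["postgres://hr/*"]
  network:
    egress: ["https://api.internal.example.com"]
  audit:
    sink: immutable-log
```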

If the CLI is the Linux of agents, the next frontier is building the Kubernetes to run it.


Written by fmind | Freelancer • AI/ML Architect & Engineer • AI Agents & MLOps • GCP Professional Cloud Architect
Published by HackerNoon on 2026/03/26