Last week, a post about my open-source project CocoIndex Code hit 54K+ views on X after @RoundtableSpace shared it. The tweet was simple: "CocoIndex Code gives your coding agent a brain." That one line captured exactly what we built and why it matters.
The problem is straightforward. Every time your AI coding agent needs context about your codebase, it pulls in entire files. Function signatures, import statements, docstrings, blank lines, comments you wrote at 2am — everything gets stuffed into the context window. You burn tokens and the agent slows to a crawl.
So I built an AST-based MCP server that fixes this.
CocoIndex Code is a lightweight, embedded MCP (Model Context Protocol) server that gives coding agents semantic understanding of your codebase. Instead of dumping raw files into the context, it uses Abstract Syntax Tree parsing to break code into meaningful chunks — functions, classes, methods — and lets agents search by meaning, not just text.
The result: 70% fewer tokens consumed, faster responses, and agents that actually find the right code on the first try.
It works with Claude Code, Codex, Cursor, and any MCP-compatible agent. One-line setup: claude mcp add cocoindex-code -- cocoindex-code. No database. No config files. No API keys.
Developers are frustrated with how much context coding agents waste. When you're paying per token or waiting for slow completions, efficiency isn't optional — it's the whole game. The viral tweet by @RoundtableSpace (54K+ views, 82 likes, 33 bookmarks) resonated because it listed exactly what people were looking for: semantic search across the entire codebase, AST-based code understanding, incremental re-indexing, local execution with no API key, and 70% token savings.
The Technical Architecture
Here's where it gets interesting. CocoIndex Code is built on top of CocoIndex, a Rust-based data transformation engine. The architecture has four key layers: AST parsing, embedding, vector search, and incremental indexing, all exposed through a single MCP search interface.
AST Parsing with Tree-sitter
We use tree-sitter grammars to parse source files into ASTs. Tree-sitter is an incremental parsing library that builds concrete syntax trees — the same tech powering syntax highlighting in Neovim and Zed. When a file is indexed, tree-sitter breaks it into structural nodes: function definitions, class declarations, method bodies. Each node becomes a chunk with its file path, language, line range, and content. A 200-line file might become 8-12 semantically meaningful chunks instead of arbitrary 500-character windows. We support 25+ languages including Python, TypeScript, Rust, Go, Java, C/C++, Kotlin, Swift, and more.
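To make the chunking idea concrete, here is a minimal sketch using Python's standard-library ast module as a stand-in for tree-sitter (CocoIndex Code itself parses 25+ languages with tree-sitter grammars; this toy version only handles Python, but produces the same kind of chunk records: path, language, line range, and content):

```python
import ast

def chunk_python_source(source: str, path: str = "example.py"):
    """Split Python source into function/class/method chunks.
    A stdlib stand-in for the tree-sitter parsing the server actually uses."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "path": path,
                "language": "python",
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "content": ast.get_source_segment(source, node),
            })
    return chunks

source = """\
def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
"""

chunks = chunk_python_source(source)
for c in chunks:
    print(c["start_line"], c["end_line"], c["content"].splitlines()[0])
```

Each structural node becomes one chunk, so a method is indexed both as part of its class and on its own, which mirrors how an agent might want either the whole class or just one method.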
Embedding and Vector Search
Each AST chunk gets embedded into a vector. By default, we use a local SentenceTransformers model (all-MiniLM-L6-v2) — zero API cost, no external calls, no rate limits, no data leaving your machine. For teams wanting better code-specific results, we support models like nomic-ai/CodeRankEmbed (137M params, ~1GB VRAM) or any of 100+ providers via LiteLLM including OpenAI, Gemini, Voyage, Cohere, and Ollama. Vectors are stored in an embedded SQLite database with the sqlite-vec extension for vector similarity search. No Postgres, no Pinecone, no external services. Everything lives in a .cocoindex_code/ directory in your project root.
Incremental Indexing via CocoIndex Engine
This is the performance secret. CocoIndex's Rust engine tracks file changes and only re-indexes what's modified. On a large codebase, the initial index might take 30-60 seconds. After that, updates are near-instant because unchanged files are skipped entirely. The MCP server exposes a single search tool that accepts a natural language query or code snippet, with parameters for limit, offset, and whether to refresh the index before searching. Results come back with file path, language, code content, line numbers, and a similarity score. The agent gets precisely what it needs.
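The skip-unchanged idea can be sketched in a few lines. This is a toy content-hash tracker, not CocoIndex's Rust engine (which also handles dependency tracking and deletion), but it shows why re-index time collapses after the first run:

```python
import hashlib

class IncrementalIndexer:
    """Toy change tracker: re-index a file only when its content hash changes.
    Illustrative only; the real engine is CocoIndex's Rust core."""

    def __init__(self):
        self.seen = {}  # path -> sha256 of last-indexed content

    def needs_reindex(self, path: str, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        if self.seen.get(path) == digest:
            return False  # unchanged since last index: skip entirely
        self.seen[path] = digest
        return True

idx = IncrementalIndexer()
print(idx.needs_reindex("a.py", b"x = 1"))  # first sight: True
print(idx.needs_reindex("a.py", b"x = 1"))  # unchanged: False
print(idx.needs_reindex("a.py", b"x = 2"))  # modified: True
```

On a steady-state codebase almost every file hits the False branch, so an index refresh touches only the handful of files you just edited.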
Why AST Beats Naive Chunking
Consider what happens when you naively split a Python file every 500 characters. You might cut a function in half. You might lump two unrelated functions together. The embedding captures noise, and search results are mediocre. AST chunking guarantees that every chunk is a complete semantic unit. A function is a function. A class is a class. The embeddings are cleaner, the search is sharper, and the agent wastes fewer tokens on irrelevant context.
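You can see the failure mode of fixed-size splitting in two lines of Python. With a small window (40 characters here, to keep the example short), the first chunk swallows all of one function plus the severed header of the next:

```python
def naive_chunks(text: str, size: int = 40):
    # Fixed-size splitting with no awareness of code structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

source = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"

for c in naive_chunks(source):
    print(repr(c))
```

The first chunk ends mid-signature with the fragment "def sub", so its embedding mixes two unrelated functions while the second chunk has no function name at all. An AST chunker would instead emit add and sub as two clean units.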
On a real-world security audit task, with CocoIndex Code the agent completed in 34 seconds using 652 tokens. Without it: 54 seconds, 1.8K tokens, and the agent still asked follow-up questions. That's not a synthetic benchmark — that's actual agent behavior on a real codebase.
CocoIndex Code is open source under Apache 2.0. We're actively working on enterprise features for large monorepos — index sharing across teams, branch deduplication, and remote setups. The core engine (CocoIndex) is also open source and powers the whole pipeline.
If you want to try it, it takes just two commands: pipx install cocoindex-code, then claude mcp add cocoindex-code -- cocoindex-code. Your agent now understands your code.
GitHub: https://github.com/cocoindex-io/cocoindex-code
The viral X post: https://x.com/RoundtableSpace/status/2031366453153157139
