How to Build an LLM-Powered CLI Tool in Python

Written by ksurya220 | Published 2025/12/08

TL;DR: This tutorial shows how to build a real-time, AI-powered command-line tool in Python using OpenAI's Realtime API. You'll create llm-explain, a CLI utility that explains any shell command by streaming LLM responses directly into your terminal. The guide covers setting up a WebSocket client, handling streaming output, and extending the tool with optional "AI agent" capabilities like tool-calling and safe shell execution. By the end, you'll have a reusable framework for building your own AI-native CLI assistants.

Why AI Belongs in the Terminal

Developers spend a huge chunk of their time in the terminal: running commands, reading logs, debugging scripts, working with git, managing servers, and automating tasks.


But the terminal is also unforgiving:

  • You must know the right flags
  • You must remember syntax
  • You need context for errors
  • Debugging often involves trial-and-error
  • Scripts quickly become unmanageable


Since LLMs excel at explanation, transformation, and reasoning, the CLI is a perfect environment for AI augmentation.


Imagine tools that can:

  • Explain complex commands and pipelines
  • Suggest safer alternatives
  • Read and summarize logs
  • Generate bash scripts on the fly
  • Fix broken git commands
  • Walk you through debugging steps
  • Serve as an “AI man page”


In other words, AI can make the terminal friendlier, smarter, and a lot more powerful.

How to Bring AI-Native Interactions Directly Into Your Terminal

The developer terminal hasn’t changed much in decades. It’s still a fast, scriptable, text-based interface designed for humans who know exactly what they’re doing. But what if your terminal could help you? What if the CLI itself could explain unfamiliar commands, auto-correct mistakes, generate scripts, reason about logs, or even execute actions with intelligence?


In this tutorial, we’ll build an LLM-powered CLI assistant using Python, the Realtime API, and a lightweight terminal UI. Our sample tool, called llm-explain, lets you type any shell command and get a real-time explanation streamed directly in your terminal. The experience feels like ChatGPT running natively inside your CLI.


This article covers:

  • How the OpenAI Realtime API works
  • Why it’s ideal for CLI tooling
  • Step-by-step implementation
  • Complete working Python example
  • Optional tool-calling (agents that can take actions)
  • Ideas for more advanced tools


What Is the OpenAI Realtime API?

The Realtime API is a WebSocket-based interface that provides:


a) Low-latency token-by-token streaming: Great for CLI output where you want text to appear naturally.


b) Event-driven communication: You can send and receive events such as:

  • input_text
  • response.output_text.delta
  • response.completed
  • response.tool_call

This enables multi-turn conversations and dynamic behaviors.


c) Built for interactive apps: Unlike the classic REST API, the Realtime API is optimized for interactive use cases: IDE assistants, terminals, real-time agents, live coding, and voice interfaces.


d) Optional "tool calling": Tools let you define functions the model can request, enabling command execution, file manipulation, queries, retrieval, or anything else your Python program can do.


This is extremely powerful and makes the model feel alive.
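Concretely, a single explain request boils down to a handful of JSON events exchanged over the WebSocket. The event names below follow the Realtime WebSocket protocol as documented for the beta API; they may differ in newer versions, so treat this as a sketch rather than a definitive reference:

```python
import json

# Client -> server: add the user's message to the conversation.
user_message = {
    "type": "conversation.item.create",
    "item": {
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": "Explain: ls -la"}],
    },
}

# Client -> server: ask the model to generate a response.
request_response = {"type": "response.create"}

# Server -> client: text arrives as a stream of delta events,
# followed by a completion event.
example_delta = {"type": "response.output_text.delta", "delta": "This command "}
example_done = {"type": "response.completed"}

# Everything on the wire is JSON-encoded before sending.
wire_payload = json.dumps(user_message)
```

Once you can speak this event vocabulary, everything else is plumbing.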

Project Overview: Building llm-explain

Our example tool mimics a smart, AI-powered version of man pages.


You run:

python explain.py "tar -xzf backup.tar.gz -C /tmp"


And the system streams back:

This command extracts (-x) a gzip-compressed archive (-z)
from backup.tar.gz into the /tmp directory (-C /tmp). The -f 
flag specifies the archive file.

All streamed live, token by token.


The project is tiny but demonstrates the full power of the Realtime API.

Project Structure

llm-explain/
 ├── client.py
 ├── explain.py
 └── README.md


Two Python files do all the work:

  • client.py: a small wrapper for connecting to the Realtime WebSocket
  • explain.py: our command line interface
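Before writing any code, install the two third-party dependencies the tool relies on (package names assumed from PyPI; pin versions as needed for your environment):

```shell
pip install websockets rich
```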

Step 1: Implement the Realtime Client

Create client.py:

# client.py
import json

import websockets  # pip install websockets

# Replace the model query parameter with a Realtime model
# available to your account.
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4.1-realtime"

class RealtimeClient:
    def __init__(self, api_key):
        self.api_key = api_key
        self.ws = None

    async def connect(self):
        # websockets >= 14 uses `additional_headers`;
        # older versions call this keyword `extra_headers`.
        self.ws = await websockets.connect(
            REALTIME_URL,
            additional_headers={
                "Authorization": f"Bearer {self.api_key}",
                "OpenAI-Beta": "realtime=v1",
            },
        )

    async def send_event(self, event):
        await self.ws.send(json.dumps(event))

    async def listen(self):
        # Yield each server event as a decoded dict, as it arrives.
        async for msg in self.ws:
            yield json.loads(msg)

    async def close(self):
        if self.ws is not None:
            await self.ws.close()


This class:

  • Establishes a WebSocket connection
  • Sends events to the model
  • Returns events as they’re streamed


This is the entire “real-time engine” powering the CLI.

Step 2: Create the CLI Tool

Now, create explain.py:

# explain.py
import argparse
import asyncio
import os

from rich.console import Console  # pip install rich

from client import RealtimeClient

console = Console()

async def explain_command(command):
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY environment variable.")

    client = RealtimeClient(api_key)
    await client.connect()

    # Add the user's prompt to the conversation, then ask the model
    # to respond. (Event names follow the Realtime WebSocket protocol;
    # check the current API reference if they have changed.)
    await client.send_event({
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [{
                "type": "input_text",
                "text": f"Explain what this command does:\n\n{command}",
            }],
        },
    })
    await client.send_event({"type": "response.create"})

    console.print(f"[bold green]🔍 Explaining:[/bold green] {command}\n")

    # Stream output in real time
    async for event in client.listen():
        if event["type"] == "response.output_text.delta":
            console.print(event["delta"], end="")
        elif event["type"] in ("response.completed", "response.done"):
            break

    console.print()  # final newline after the streamed text
    await client.close()

def main():
    parser = argparse.ArgumentParser(description="Explain any CLI command using LLMs.")
    parser.add_argument("cmd", type=str, help="Command to explain")
    args = parser.parse_args()

    asyncio.run(explain_command(args.cmd))

if __name__ == "__main__":
    main()


This script:

  • Reads the command passed via CLI
  • Sends the message through the Realtime API
  • Displays the model’s response as a live stream


This gives developers an AI-native terminal experience.

Step 3: Run the Tool

Set your OpenAI key:

export OPENAI_API_KEY="your_key"


Explain any command:

python explain.py "git rev-list --count HEAD"


Example output (streamed):

🔍 Explaining: git rev-list --count HEAD
This command counts how many commits exist in the current branch up to HEAD. The --count flag returns the numeric total instead of listing individual revisions.


The result is fast, fluid, and extremely helpful when you’re unsure what a command does.
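If you want the full explanation as a string (for piping or logging) rather than just live terminal output, you can fold the delta events into a transcript as they arrive. A small sketch, assuming the delta/completed event shapes used in explain.py:

```python
def accumulate_deltas(events):
    """Collect streamed text deltas into one final string.

    `events` is any iterable of decoded Realtime events;
    collection stops at the first completion event.
    """
    parts = []
    for event in events:
        if event.get("type") == "response.output_text.delta":
            parts.append(event.get("delta", ""))
        elif event.get("type") == "response.completed":
            break
    return "".join(parts)

# Example with canned events:
sample = [
    {"type": "response.output_text.delta", "delta": "Counts commits "},
    {"type": "response.output_text.delta", "delta": "up to HEAD."},
    {"type": "response.completed"},
]
transcript = accumulate_deltas(sample)  # "Counts commits up to HEAD."
```

The same helper works for any streamed response, not just explanations.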

Step 4: Optional — Add Tool Calling (AI That Executes Commands)

You can expose functions that the model can call.


Define a tool:

# Function tools describe their arguments with a JSON Schema.
tools = [
    {
        "type": "function",
        "name": "run_shell",
        "description": "Execute a shell command",
        "parameters": {
            "type": "object",
            "properties": {
                "cmd": {"type": "string"},
            },
            "required": ["cmd"],
        },
    }
]


Listen for tool calls:

# Requires `import subprocess` at the top of explain.py. The exact
# tool-call event shape depends on the API version you target.
elif event["type"] == "response.tool_call":
    if event["name"] == "run_shell":
        output = subprocess.getoutput(event["args"]["cmd"])
        await client.send_event({
            "type": "tool_output",
            "content": output
        })


Important: Only allow safe, sandboxed execution, especially on multi-user systems.
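A minimal first line of defense is to check the requested command against an allowlist before executing anything. This is an illustrative sketch (the command set and checks are assumptions you should tailor), not a substitute for a real sandbox:

```python
import shlex

# Programs the assistant is allowed to run; everything else is refused.
# Purely illustrative; choose your own set.
ALLOWED_COMMANDS = {"ls", "cat", "git", "grep", "head", "tail"}

def is_safe_command(cmd: str) -> bool:
    """Return True only if the command's program is allowlisted
    and the string contains no shell control operators."""
    if any(ch in cmd for ch in (";", "&", "|", ">", "<", "`", "$")):
        return False
    try:
        tokens = shlex.split(cmd)
    except ValueError:  # unbalanced quotes, etc.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS
```

Call this before subprocess.getoutput and send an error message back to the model when the check fails, so it can try a safer command instead.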


But once sandboxed, this unlocks:

  • llm-git that automatically fixes your errors
  • llm-logs that identifies failure patterns
  • llm-devops that applies infrastructure changes
  • llm-shell where the model becomes your command runner


This is where things get insanely powerful. With this pattern, developers can build a whole ecosystem of AI CLI assistants. Here are real projects you can build:


1) AI Man Page 2.0

Ask questions like: llm-help "What is the difference between grep -r and grep -R?"


2) Git Doctor

Automatically fix common git issues: llm-git "help me resolve this merge conflict"


3) AI Log Debugger

Paste logs to get root cause analysis.

Conclusion

The CLI has always been one of the most powerful environments for developers but also one of the least accessible. With the OpenAI Realtime API, it’s now possible to bring AI directly into that workflow in a natural, real-time, low-latency way.



Written by ksurya220 | A seasoned software engineer with over 15 years of experience developing scalable web & enterprise applications
Published by HackerNoon on 2025/12/08