On March 24, 2026, someone published two malicious versions of LiteLLM to PyPI. Versions 1.82.7 and 1.82.8 contained a three-stage payload that harvested SSH keys, cloud credentials, Kubernetes secrets, and cryptocurrency wallets from every machine where the package was installed.
The entire LiteLLM package is now quarantined on PyPI. No version is available for download.
I am writing this because the attack is more interesting than most supply chain incidents, as the blast radius extends far beyond people who intentionally installed LiteLLM, and the incident response steps are not obvious. If you run Python in any environment where LiteLLM might exist as a transitive dependency, this post is for you.
The Attack Chain: From Trivy to LiteLLM in Five Days
This was not an isolated event. It was the third strike in a coordinated campaign by a threat actor called TeamPCP, and the chain of events matters because it shows how a single stolen credential can cascade across unrelated open-source projects.
Here’s a timeline of events:
- March 19: TeamPCP compromised Aqua Security's Trivy scanner by force-pushing malicious version tags. The compromised Trivy action was designed to exfiltrate secrets from CI/CD runners.
- March 23: The same group hijacked all 35 tags in Checkmarx's KICS GitHub Actions through a compromised service account. They also hit OpenVSX extensions the same day.
- March 24: Using a PyPI token stolen during the Trivy compromise, TeamPCP published malicious LiteLLM packages at 10:39 UTC. PyPI quarantined them around 11:25 UTC. Roughly 45 minutes of exposure.
Here is the critical detail: LiteLLM's CI/CD pipeline ran Trivy without a pinned version. When the compromised Trivy action executed inside LiteLLM's GitHub Actions runner, it grabbed the PYPI_PUBLISH token and sent it to the attacker. That token was then used to push the malicious packages directly to PyPI.
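The missing control was a one-line change in the workflow file: pin the action to a full commit SHA instead of a mutable ref. Tags and branches can be force-pushed; a commit SHA cannot. A sketch of the difference (the SHA shown is illustrative, not a real Trivy release):

```yaml
# Unpinned: follows whatever the tag or branch currently points at,
# including a force-pushed malicious commit.
- uses: aquasecurity/trivy-action@master

# Pinned: an immutable full commit SHA; a hijacked tag cannot redirect it.
- uses: aquasecurity/trivy-action@1111111111111111111111111111111111111111  # illustrative SHA
```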
LiteLLM is an API key management gateway. The attacker targeted the one package that, by design, has access to every LLM API key in the organization. That is not a coincidence.
How the Malware Actually Worked
Version 1.82.7 embedded the payload in proxy/proxy_server.py. It activated on import. Straightforward.
Version 1.82.8 used a technique that most developers have never seen in the wild. It shipped a file called litellm_init.pth. Python's site module scans site-packages for .pth files at interpreter startup and executes any line that begins with import, before any user code runs. Merely having the package installed was enough to trigger the payload.
The payload used double base64 encoding:
import os, subprocess, sys
subprocess.Popen([
    sys.executable, "-c",
    "import base64; exec(base64.b64decode('...'))"
])
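To make the .pth mechanism concrete, here is a minimal sketch of the rule Python's site module applies: any line in a .pth file that starts with import is executed at interpreter startup, while other lines are treated as paths to add. The helper below mirrors that rule; the sample file contents are illustrative.

```python
# Sketch: which lines of a .pth file does Python execute at startup?
# Per CPython's site module, lines beginning with "import " (or
# "import\t") are exec'd immediately; other lines are path entries.

def pth_lines_executed(pth_text: str) -> list[str]:
    """Return the lines Python's site module would exec at startup."""
    return [
        line for line in pth_text.splitlines()
        if line.startswith(("import ", "import\t"))
    ]

# Illustrative .pth contents: one path line, one executable line.
demo = "litellm\nimport os; os.system('...')  # attacker-controlled\n"
print(pth_lines_executed(demo))  # only the import line would run
```

This is why the 1.82.8 variant fired in every Python process on the machine rather than only on import of litellm.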
Once decoded, the attack ran in three stages:
Stage 1 (Credential Harvesting): The payload collected SSH keys, AWS/GCP/Azure tokens, environment variables including API keys and tokens, .env files, Kubernetes configs, Docker configs, database credentials, shell history, browser cookies, and cryptocurrency wallets. It scanned every common credential storage location on the host.
Stage 2 (Kubernetes Lateral Movement): It deployed privileged pods, named to pass as node-setup jobs, into the kube-system namespace, giving the attacker host-level access across the cluster and reach into secrets well beyond the initially compromised machine.
Stage 3 (Persistent Backdoor): It installed sysmon.py as a systemd service that polled checkmarx[.]zone/raw for additional binaries. This provided persistent access even after the initial malware was discovered.
All harvested data was encrypted and exfiltrated via POST request to models.litellm[.]cloud, a lookalike domain controlled by TeamPCP.
Why the Blast Radius Extends Beyond Direct LiteLLM Users
This is the part that catches people off guard.
The .pth mechanism means the malware executed in every Python process on any machine where LiteLLM 1.82.8 existed in the environment. A data scientist opening a Jupyter notebook, a cron job firing a maintenance script, a CI step running pytest: each one triggered the payload, whether or not it ever imported litellm.
The key point: you did not need to install it yourself. If another package in your dependency tree pulled LiteLLM in, the malware still executed.
Agent frameworks, LLM orchestration tools, and AI development environments frequently pull LiteLLM as a transitive dependency. Most developers never audit what their dependencies depend on.
How to Check If You Are Affected
Run this across local machines, CI/CD runners, Docker images, staging, and production environments:
pip show litellm | grep Version
pip cache list litellm
find / -name "litellm_init.pth" 2>/dev/null
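The same check can be scripted across a fleet. A minimal sketch using only the standard library; the version set matches the two quarantined releases:

```python
from importlib import metadata

# The two versions quarantined by PyPI.
COMPROMISED = {"1.82.7", "1.82.8"}

def litellm_status() -> str:
    """Report whether this environment has a known-bad LiteLLM install."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "not installed"
    return "COMPROMISED" if version in COMPROMISED else f"clean ({version})"

print(litellm_status())
```

Run it under the same interpreter your workloads use; a clean shell environment says nothing about a virtualenv or container image.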
Check Docker layer histories too. A clean pip show on the current image does not mean a previous layer did not install and then remove the package.
Scan Egress Logs for Exfiltration
Any traffic to models.litellm[.]cloud or checkmarx[.]zone is a confirmed breach:
# CloudWatch
fields @timestamp, @message
| filter @message like /models\.litellm\.cloud|checkmarx\.zone/
# Nginx
grep -E "models\.litellm\.cloud|checkmarx\.zone" /var/log/nginx/access.log
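If your logs are local files rather than CloudWatch, the same IOC sweep is a few lines of Python. The domains are the two from this incident; the sample log lines are illustrative:

```python
import re

# The two exfiltration domains from this incident.
IOC = re.compile(r"models\.litellm\.cloud|checkmarx\.zone")

def find_ioc_hits(log_text: str) -> list[str]:
    """Return every log line mentioning an indicator domain."""
    return [line for line in log_text.splitlines() if IOC.search(line)]

sample = (
    "POST https://models.litellm.cloud/upload 200\n"
    "GET https://pypi.org/simple/ 200\n"
)
print(find_ioc_hits(sample))  # only the exfiltration line matches
```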
Audit Transitive Dependencies
pip show litellm # Check the "Required-by" field
If other packages listed there are in your dependency tree, LiteLLM entered your environment without your explicit consent. That is how this attack spreads.
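You can answer the same question programmatically by walking every installed distribution's declared requirements with the standard library. A sketch that mirrors pip's Required-by field:

```python
import re
from importlib import metadata

def packages_requiring(target: str) -> list[str]:
    """List installed packages that declare `target` as a dependency."""
    dependents = set()
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # Requirement strings look like "litellm>=1.0; extra == 'proxy'"
            name = re.match(r"[A-Za-z0-9._-]+", req)
            if name and name.group(0).lower() == target.lower():
                dependents.add(dist.metadata["Name"])
    return sorted(dependents)

print(packages_requiring("litellm"))  # empty means nothing pulls it in here
```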
Incident Response Playbook
Step 1: Isolate Immediately
docker ps | grep litellm | awk '{print $1}' | xargs docker kill
kubectl scale deployment litellm-proxy --replicas=0 -n your-namespace
Stop all running LiteLLM containers. Scale down any Kubernetes deployments that run the proxy, and block outbound traffic from affected hosts until the investigation is complete.
Step 2: Rotate Every Credential on Affected Machines
The malware harvested everything it could find. Every credential that was stored on or accessible from the compromised environment should be treated as known to the attacker.
- Cloud Provider Tokens: AWS access keys, GCP service account keys, Azure AD tokens
- SSH Keys: All keys in ~/.ssh/, regenerate and redistribute
- Database Credentials: Connection strings, passwords in .env files
- API Keys: OpenAI, Anthropic, Gemini, every LLM provider key
- Service Account Tokens: Kubernetes service accounts, CI/CD tokens, PyPI tokens
- Crypto Wallets: Move funds immediately if wallet files were on the machine
This is not optional. The credential harvester was thorough.
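One way to turn that list into a per-host checklist is to probe the default credential locations the harvester targeted. A sketch; the paths are common defaults and will differ in hardened environments:

```python
from pathlib import Path

# Common credential locations targeted by the harvester (defaults only).
CREDENTIAL_PATHS = [
    "~/.ssh",                 # SSH private keys
    "~/.aws/credentials",     # AWS access keys
    "~/.config/gcloud",       # GCP tokens
    "~/.azure",               # Azure CLI tokens
    "~/.kube/config",         # Kubernetes contexts
    "~/.docker/config.json",  # registry auth
    ".env",                   # project secrets in the working directory
]

def rotation_checklist(paths: list[str]) -> list[str]:
    """Return the paths that exist on this host: each one needs rotation."""
    return [p for p in paths if Path(p).expanduser().exists()]

print(rotation_checklist(CREDENTIAL_PATHS))
```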
Step 3: Audit Kubernetes and Remove All Artifacts
# Check for lateral movement pods
kubectl get pods -n kube-system | grep -i "node-setup"
find / -name "sysmon.py" 2>/dev/null
# Full removal
pip uninstall litellm -y && pip cache purge
rm -rf ~/.cache/uv
find $(python -c "import site; print(site.getsitepackages()[0])") \
-name "litellm_init.pth" -delete
rm -rf ~/.config/sysmon/ ~/.config/systemd/user/sysmon.service
docker build --no-cache -t your-image:clean .
The malware deployed privileged pods into kube-system and installed a persistent backdoor as a systemd service. Both need to be found and removed before you rebuild anything.
Do not downgrade to an older version. Remove entirely and replace.
The Structural Problem with Self-Hosted Python LLM Proxies
I want to step back from the immediate incident response and talk about the architecture decision that made this attack possible.
LiteLLM's Python proxy inherits hundreds of transitive dependencies. ML frameworks, data processing libraries, provider SDKs. Every one of those is a trust decision that most teams make automatically every time they run pip install --upgrade.
When you add LiteLLM to a project, you are not just trusting LiteLLM. You are trusting every package it depends on, every package those packages depend on, and every maintainer account associated with each one. That is hundreds of people and organizations whose security practices you are implicitly vouching for.
The .pth attack vector makes this worse. Most supply chain scanning tools focus on setup.py, install-time hooks, and suspicious imports inside package modules. A .pth file that runs at interpreter startup slips past nearly all of them.
And here is the timeline problem that matters: LiteLLM maintainers did not rotate their CI/CD credentials for five days after the Trivy compromise that stole their PYPI_PUBLISH token. The token stayed valid the entire window, and on day five the attacker used it.
This is an inherent risk of the self-hosted model. You inherit every vulnerability, every delayed response, every unpinned dependency in the upstream project.
Dependency Pinning Is Not Enough
This comes up every time there is a supply chain attack, so let me address it directly.
Pinning versions protects against pulling a new malicious version. It does not protect against a compromised maintainer overwriting an existing tag. Hash verification (a requirements file with --hash=sha256:<exact_hash> entries, enforced by pip's hash-checking mode) is the real control. Most teams skip it because the tooling is inconvenient and it breaks every time a legitimate update ships.
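Under the hood, pip's hash check is just a digest comparison before install. A minimal sketch of that control, with a throwaway file standing in for a downloaded wheel:

```python
import hashlib
import tempfile
from pathlib import Path

def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Compare a downloaded artifact's sha256 against the pinned value."""
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected_sha256

# Stand-in for a wheel: any change to the bytes fails the check.
wheel = Path(tempfile.mkdtemp()) / "demo.whl"
wheel.write_bytes(b"package contents")
pinned = hashlib.sha256(b"package contents").hexdigest()

print(verify_artifact(wheel, pinned))    # True: bytes match the pin
print(verify_artifact(wheel, "0" * 64))  # False: tampered or wrong pin
```

A force-pushed tag changes the bytes, so the pinned digest fails even when the version number looks identical.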
The stronger architectural answer is to remove the dependency entirely. A managed LLM gateway reduces the trust boundary to an API key and a URL instead of a Python environment with hundreds of transitive dependencies. Several options exist now.
Before (LiteLLM):
from litellm import completion
response = completion(model="gpt-5", messages=[{"role": "user", "content": "Hello"}])
After (managed gateway):
from openai import OpenAI
client = OpenAI(base_url="https://gateway.futureagi.com", api_key="sk-prism-your-key")
response = client.chat.completions.create(
model="gpt-5", messages=[{"role": "user", "content": "Hello"}]
)
Same interface, same response objects; the only changes are the base URL and the key. Nothing self-hosted remains: no proxy process to patch, no transitive dependency tree to audit.
Other managed gateways follow the same pattern: an OpenAI-compatible endpoint in front of multiple providers, with the trust boundary reduced to the client library you already run.
The Compliance Angle
The security questionnaire implications are immediate. If you sell software that handles customer credentials, expect controls reviews to ask how a compromised dependency could reach those credentials and how quickly you would detect it.
"Install the latest version from PyPI" is no longer an acceptable answer during a controls review. If your product uses LiteLLM and your customers' credentials were exfiltrated, the liability is yours, not the open-source maintainer's.
What These Changes Mean Going Forward
Every team running LLM applications now faces a clear architectural question: own the proxy infrastructure and inherit every supply chain risk that comes with it, or use a managed endpoint and reduce the trust boundary to an API call.
The LiteLLM compromise is not a one-off event. It is the third hit in a coordinated campaign that landed in five days. The dependency trees running through self-hosted Python LLM proxies are deep, the release cadence is fast, and pulling the latest version from PyPI is exactly the behavior attackers exploit.
Rotating credentials and removing the compromised package solves today's problem. Rethinking whether a self-hosted Python proxy belongs in the architecture at all solves the category of problem.
The question is not whether another self-hosted LLM gateway will be targeted. It is whether your architecture limits the damage when it happens.
