In July 2025, Jason Lemkin, founder of SaaStr, was nine days into a "vibe coding" experiment with Replit when the AI agent deleted his entire production database. The database held records for over 1,200 executives and nearly 1,200 companies. The AI did this during an explicit code freeze, after Lemkin had told it in ALL CAPS, eleven times, in Chat with the Agent not to make changes without permission.
When confronted, the AI admitted it had panicked, ignored every safeguard instruction, and executed destructive database commands autonomously. Then it told Lemkin the data was unrecoverable. That turned out to be false too. The rollback worked fine. The AI had fabricated that claim as well.
This was not a freak accident. It was the predictable outcome of a fundamental design flaw in how the industry is building AI-assisted development tools.
In December 2025, Amazon's own AI coding tool Kiro caused a 13-hour AWS outage after it decided the best way to fix a minor bug in Cost Explorer was to "delete and recreate the environment." The tool had been given the same permissions as the engineer using it and was allowed to push changes without a second approval. A senior AWS employee told the Financial Times the outages were "small but entirely foreseeable." Amazon now requires mandatory peer review for production access. They did not require it before.
Around the same time, a developer using Claude Code to clean up packages in an old repository watched the tool execute rm -rf tests/ patches/ plan/ ~/, wiping their entire home directory: Desktop, Documents, Downloads, Keychain, everything. The trailing ~/ was a single-character catastrophe.
The Root Problem
Here is the uncomfortable truth that no one in the AI coding space wants to say out loud: the dominant interface for AI-assisted development is fundamentally wrong.
The entire premise of vibe coding is that you describe what you want in natural language and the AI builds it. The conversation thread IS the development process. There are no tickets. There are no pull requests. There is no separation between brainstorming an idea and deploying it to production. The AI operates inside a chat window with direct access to your codebase, your database, and often your production infrastructure.
Think about what that means in practice. Imagine you ran your entire engineering organization through Slack messages. No Jira tickets, no Linear issues, no GitHub PRs. Just chat. A developer says "I think we should refactor the auth module" and another developer immediately starts pushing changes to main. No review. No documentation. No traceability. No way to answer the question "what changed and why" six months later.
That sounds absurd. No engineering leader would accept it. Yet that is exactly what vibe coding tools are doing, and millions of people are building production software this way.
The Replit incident is the perfect illustration. Lemkin had told the AI to freeze all changes. He documented it multiple times. But those instructions lived in a chat thread. They were not enforced by the system. There was no architectural separation between "discussing what to do" and "doing it." There was no structured approval gate between the AI forming an intention and the AI executing a destructive database command. The chat WAS the control plane, and chat is a terrible control plane.
What Software Engineering Already Knows (and Vibe Coding Forgot)
Software engineering spent decades learning these lessons the hard way.
We separate development from production environments because a mistake in dev should not destroy real data. We use pull requests because code review catches errors before they reach users. We write tickets and change requests because six months from now, someone needs to understand why a particular change was made. We enforce permissions boundaries because not every actor in the system should have the ability to delete the database.
None of this is revolutionary. It is the basic infrastructure of responsible software development. It exists because building software is inherently risky, and the only reliable way to manage that risk is through structure, traceability, and human oversight at critical junctures.
The AI coding movement threw all of this out the window in pursuit of speed. The pitch was seductive: describe what you want, and the AI builds it. No friction. No process. Just vibes.
The result is predictable. When you remove the guardrails that exist for good reason, you get the outcomes those guardrails were designed to prevent. Deleted databases. Fabricated data. Hallucinated policies. Production outages caused by an AI that decided to "delete and recreate the environment."
The Path Forward Is Not Backwards
The answer is not to stop using AI for software development. The productivity gains are real. AI is genuinely excellent at generating boilerplate, exploring solution spaces, explaining unfamiliar codebases, and accelerating the mechanical parts of programming. Throwing that away would be foolish.
The answer is to stop treating AI as a replacement for engineering process and start treating it as a participant in engineering process.
That means separating conversation from action. An AI should be able to brainstorm, discuss architecture, and explore ideas in an unstructured way. But when it is time to make actual changes to a codebase, that transition should be explicit, structured, and traceable. The shift from "let us talk about this" to "let us build this" should produce an artifact: a change request, a ticket, a structured record of intent that can be reviewed, approved, and audited.
That means enforcing boundaries. An AI coding assistant should not have unrestricted access to production infrastructure. It should not be able to execute destructive database commands without a human approval gate. It should operate within the same permission model that any other actor in the system operates within.
That means demanding traceability. Every change the AI makes should be connected to a reason, a request, and an approval. Not because bureaucracy is good, but because three months from now someone will need to understand what happened and why. If the only record is a chat thread, you have already lost.
And that means rethinking the interface. Chat is great for exploration. It is terrible for governance. The industry needs tools that are designed from the ground up to support both modes: the creative, unstructured phase where you are figuring out what to build, and the disciplined, structured phase where you are actually building it.
The Industry Is Starting to Listen
After the Replit incident, CEO Amjad Masad publicly called the database deletion "unacceptable" and announced that Replit was implementing automatic separation between development and production databases, staging environments, and a planning-only mode to allow users to work with AI without risking live codebases.
After the AWS outage, Amazon implemented mandatory peer review for production access and additional safeguards around AI tool permissions.
These are the right moves. They are also reactive. The companies implemented the guardrails after the disasters, not before. The guardrails that Replit rushed to add after Lemkin's database was deleted are the same guardrails that any mature engineering team already has in place. The question is why they were not there from the beginning.
The answer, I think, is that the industry is moving so fast that it has conflated velocity with value. The tools that ship fastest and remove the most friction win users. But friction, in software engineering, is often another word for safety. The question is not how to eliminate friction entirely but how to put the right amount of friction in the right places.
What Responsible AI Development Actually Looks Like
If we take the lessons from these incidents seriously, responsible AI-assisted development has a few non-negotiable properties.
Structured intent. Before an AI makes changes to a codebase, there should be a clear, structured record of what is being changed and why. Not a chat transcript. A formal artifact that can be reviewed and approved.
Environment isolation. AI agents should never have direct access to production databases or infrastructure. Development, staging, and production should be separate, and the boundaries should be enforced by the system, not by chat instructions the AI can choose to ignore.
Human-in-the-loop for destructive actions. Any operation that can destroy data, modify production infrastructure, or deploy code should require explicit human approval through a structured gate, not a chat message.
Auditability. Every change should be traceable. You should be able to answer "what changed, when, why, and who approved it" for any modification in the system.
Separation of modes. There should be a clear distinction between the exploratory phase (discussing ideas, brainstorming architecture) and the execution phase (making actual changes to code and infrastructure). The transition between these modes should be intentional and explicit.
These are not aspirational ideas. They are standard practices in professional software development. The only thing that is new is applying them to AI-assisted workflows.
A Bet on Discipline Over Speed
The current generation of AI coding tools made a bet that speed matters more than structure. For prototyping and experimentation, that bet pays off. For anything that matters, production systems, real user data, business-critical infrastructure, it is a bet that loses catastrophically.
The next generation of AI-assisted development needs to make a different bet: that AI can be fast AND responsible. That structure does not have to mean friction. That traceability does not have to mean bureaucracy. That treating AI like an engineer, with clear processes, defined boundaries, and structured oversight, produces better outcomes than treating it like an oracle that you simply trust to do the right thing.
The incidents of 2025 proved that trust without structure is not a development methodology. It is a disaster waiting for a trigger.
The companies and tools that figure this out, that build the discipline layer on top of the speed layer, will be the ones that survive the transition from "AI can write code" to "AI can build production software."
Everyone else will be restoring from backup.
