An AI Coding Agent Took Down AWS for 13 Hours. Amazon's Response Is the Enterprise Governance Template.
Amazon's Kiro AI coding agent caused a 13-hour AWS outage. Amazon's response — mandatory senior engineer sign-off on all AI-generated code — is the clearest enterprise AI governance policy any major company has published.
In March 2026, Amazon's Kiro AI coding agent generated a code change that passed automated testing, cleared the standard review pipeline, and was deployed to production. Over the next thirteen hours, a cascade of dependency failures produced a partial AWS outage affecting downstream services.
The failure mode was not that Kiro made an obvious error. It made a subtle one: a dependency interaction that no automated test was configured to catch, in a codebase complex enough that the interaction was not visible at the single-file level. It was the kind of error that a senior engineer reviewing the change holistically might have caught, and that an automated system checking only for local correctness would not.
Amazon's response is worth reading carefully, because it is the clearest enterprise AI governance policy any major technology company has produced.
What Amazon Changed
Amazon mandated that all code changes generated by AI coding tools — Kiro and any other AI assistant — require senior engineer approval before deployment to production systems. Not review by any engineer. Senior engineer review, with explicit sign-off.
The policy distinguishes between what AI can do autonomously and what requires human judgment. Kiro can still suggest code, generate implementations, write tests, and draft documentation. What it cannot do is push changes to production without a human with sufficient experience to evaluate the change holistically. The distinction is between AI as a drafting tool and AI as a deployment authority.
Amazon also implemented tiered review requirements based on system criticality. Changes to internal tooling with limited blast radius move through a lighter process. Changes to production infrastructure serving external customers require the full senior engineer sign-off. The gradient reflects the asymmetry of consequences: a bad change in an internal tool costs hours; a bad change in core AWS infrastructure costs significantly more.
Why This Is the Right Policy
The failure that caused the outage was not a random error. It was a predictable class of error for AI coding tools at the current state of development: local correctness without systemic awareness.
AI coding tools are very good at generating code that does what it is asked to do in isolation. They are significantly less reliable at reasoning about how a change interacts with the broader codebase, the deployment environment, and the failure modes of systems they were not trained to understand. This is not a bug that will be fixed in the next model version. It is a structural characteristic of how these systems work: they have been trained on code, not on the operational history of specific production systems.
Senior engineers carry something AI tools do not have: accumulated knowledge of why specific decisions were made, where the system has broken before, which failure modes are most dangerous given current system state, and what tests do not exist that probably should. That knowledge is not in the codebase. It exists in the heads of people who have shipped and operated the system through its failure history.
Amazon's policy preserves that knowledge as a required input on consequential changes rather than allowing it to be bypassed by AI-generated confidence.
The Template for Enterprise AI Governance
The broader principle Amazon applied is applicable far beyond software development. Any enterprise deploying AI agents for consequential decisions needs to answer the same question: what is the human accountability structure for AI-generated actions?
Three-tier governance works for most AI deployments:
Autonomous tier. Actions where the stakes are low, the reversibility is high, and the AI's error rate is acceptable. Draft emails, summarize documents, generate reports, suggest optimizations. A human may review after the fact, but review is not required before the action completes.
Supervised tier. Actions where errors have meaningful consequences but can be caught before they cascade. Code changes to non-critical systems, customer communications above a certain value threshold, financial transactions below a material limit. AI generates; a qualified human approves before the action completes.
Hard prohibition tier. Actions that AI agents should not take autonomously regardless of confidence level. Changes to core production infrastructure. Financial transactions above materiality thresholds. Communications with legal or regulatory implications. Personnel decisions. These require human initiation, not just human approval.
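The three tiers can be sketched as a routing policy in code. This is a minimal illustration, not Amazon's implementation: the action kinds, target names, and materiality threshold below are all hypothetical placeholders an organization would replace with its own policy.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    AUTONOMOUS = "autonomous"   # low stakes, reversible: no pre-approval
    SUPERVISED = "supervised"   # qualified human approves before completion
    PROHIBITED = "prohibited"   # human must initiate; agent may only assist

@dataclass
class AgentAction:
    kind: str            # e.g. "code_change", "payment", "email_draft"
    target: str          # e.g. "internal_tooling", "prod_infra"
    amount: float = 0.0  # for financial actions

# Illustrative threshold; real values are org-specific policy decisions.
MATERIALITY_LIMIT = 10_000.0

def classify(action: AgentAction) -> Tier:
    # Hard prohibitions first: core production infra, material payments.
    if action.target == "prod_infra":
        return Tier.PROHIBITED
    if action.kind == "payment" and action.amount >= MATERIALITY_LIMIT:
        return Tier.PROHIBITED
    # Consequential but catchable errors go to supervised review.
    if action.kind in ("code_change", "payment"):
        return Tier.SUPERVISED
    # Drafting and summarization default to autonomous.
    return Tier.AUTONOMOUS

def may_execute(action: AgentAction, approved_by_senior: bool) -> bool:
    tier = classify(action)
    if tier is Tier.PROHIBITED:
        return False  # agent never executes; a human must initiate
    if tier is Tier.SUPERVISED:
        return approved_by_senior
    return True
```

The point of encoding the policy rather than documenting it is that the prohibition check runs first and unconditionally: no confidence score or approval flag can route a production-infrastructure change past it.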
The Kiro incident happened because a change that belonged in the supervised tier was being handled at the autonomous tier. The fix was not to stop using Kiro. It was to put a qualified human in the loop at the right point in the process.
The Cost of Getting This Wrong
The 13-hour outage is a visible, quantifiable failure. Most enterprise AI governance failures are not visible and do not generate headlines.
They look like: a customer service AI that resolves a complaint in a way that creates a legal exposure no one reviews until it becomes a claim. A financial analysis agent that surfaces a recommendation based on data the analyst did not realize was stale. A content generation agent that produces material with an accuracy problem that reaches customers before any human reads it.
These failures do not cause outages. They accumulate. The error rate for autonomous AI actions across an organization, multiplied by the volume of those actions, produces a steady stream of consequential mistakes that each look small and isolated until someone maps the pattern.
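A back-of-the-envelope calculation makes the accumulation concrete. Every number here is an assumption for illustration, not a figure from any real deployment:

```python
# Hypothetical inputs: even a small per-action error rate becomes a
# steady stream of mistakes at enterprise volume.
error_rate = 0.005          # 0.5% of autonomous actions go wrong
actions_per_day = 4_000     # autonomous AI actions across the org
caught_fraction = 0.6       # share of errors noticed before they matter

daily_errors = actions_per_day * error_rate
unnoticed_per_year = daily_errors * (1 - caught_fraction) * 250  # workdays

print(f"{daily_errors:.0f} errors/day, ~{unnoticed_per_year:.0f} unnoticed/year")
# → 20 errors/day, ~2000 unnoticed/year
```

Under these assumed rates, a half-percent error rate that would be invisible in any single week compounds into thousands of unreviewed mistakes a year, which is the pattern that only becomes visible when someone maps it.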
Amazon's response to the Kiro incident was fast enough to avoid the accumulation problem. Most organizations will respond to an accumulation instead of a catalyst, because most organizations will not have a 13-hour outage to force the conversation.
The governance question for enterprise AI is not "do we need oversight?" The answer to that is yes. The question is whether you design the oversight structure before or after the incident that makes it obvious.
Amazon now has a policy. Most enterprises do not.