Microsoft Agentic AI Failure Modes Taxonomy v2
Microsoft AI Red Team has updated its taxonomy for agentic AI system failures after a year of red-team engagements against deployed systems. The v2 framing adds seven categories, expands mitigations, and shifts the document from prediction toward operational evidence.
What changed
The original taxonomy arrived in April 2025 when many agent risks were still theoretical. Microsoft now says production adoption, open-source agent frameworks, MCP tooling, and computer-use agents have created failure patterns that deserve dedicated coverage.
The update highlights how agent compromise, tool poisoning, memory manipulation, impersonation, and cross-domain prompt injection behave differently once systems can browse, call tools, preserve state, and act across applications.
Why it matters
Traditional application security assumes a bounded program, explicit inputs, and deterministic execution. Agentic systems blur those assumptions. They take instructions from users, model outputs, web pages, files, tools, APIs, memories, and other agents.
That makes failure taxonomy practical infrastructure. Security teams need shared names for incidents before they can assign owners, write controls, create test cases, and measure residual risk.
Operational lessons
Microsoft cites evidence from open-source agent frameworks, MCP-related vulnerabilities, and computer-use agents that operate graphical interfaces. These systems expose attack paths that do not map cleanly to classic web-app categories.
For example, an agent that can read a page, click a button, copy a token, and update a ticket can be manipulated through visible content, hidden markup, poisoned memory, or tool output. The security boundary is the whole task environment, not just the model prompt.
Controls that now matter
- Tool allowlists: Bind agents to explicit tools, scopes, and data classes instead of broad workspace access.
- Human gates: Require approval for irreversible actions such as production deploys, financial transfers, or credential changes.
- Memory hygiene: Track who wrote persistent memories, when they were updated, and which tasks consumed them.
- Evidence logs: Record observations, tool calls, outputs, approvals, and final actions for incident reconstruction.
How teams should use it
The immediate move is to map existing agents against the taxonomy. List every agent, its tools, identities, memories, network reach, human approval path, and blast radius.
Then convert the taxonomy into tests. Red-team prompts should cover injected instructions, compromised tool outputs, identity confusion, cross-domain task jumps, and attempts to bypass human-in-the-loop controls.
Agentic AI governance is becoming less about model selection and more about runtime containment. Microsoft’s update gives security teams a vocabulary for that work.