McKinsey's AI Platform Was Breached in Two Hours. Here's What Every Enterprise Should Learn.

AI SecurityLast updated on
McKinsey's AI Platform Was Breached in Two Hours. Here's What Every Enterprise Should Learn.

On February 28, 2026, an autonomous security agent built by CodeWall compromised McKinsey's internal AI platform, Lilli, in under two hours. No credentials. No insider access. Just an unprotected API endpoint and one of the oldest vulnerability classes in computing: SQL injection.

The result was full read-write access to a production database containing 46.5 million chat messages, 728,000 files, 57,000 user accounts, and 3.68 million RAG document chunks representing decades of proprietary McKinsey research and frameworks. The security agent's own assessment upon discovering the scope: "This is devastating."

McKinsey responded quickly, patching the vulnerability within days of responsible disclosure. That response deserves credit. But the breach itself exposes a pattern we see across enterprise AI deployments: organizations invest heavily in building AI capability while underinvesting in the governance infrastructure that secures it.

This is not a McKinsey problem. It is an industry problem.

The Attack Surface Nobody Mapped

Lilli launched in 2023 and grew rapidly. By the time of the breach, over 70% of McKinsey's 43,000+ employees used the platform, processing more than 500,000 prompts per month. The platform hosted 384,000 AI assistants across 94,000 workspaces. It connected to 266,000+ OpenAI vector stores containing 1.1 million files.

Consider the scale of that deployment. Now consider what CodeWall found: 22 of over 200 API endpoints lacked authentication entirely. One of those endpoints accepted search queries where field names were concatenated directly into SQL statements. Values were parameterized, but field names were not.

Standard security tools missed it. OWASP ZAP, one of the most widely used application security scanners, failed to detect the vulnerability. The CodeWall agent found it through fifteen blind iterations of prompt injection-style SQL injection, using error messages to reverse-engineer the query structure and escalate from reconnaissance to full database access.

The lesson here is not that McKinsey's security team failed. It is that traditional application security tools were not designed for the threat surface that AI platforms create. AI systems introduce new categories of assets, new data flows, and new attack vectors that point-in-time scanning tools do not cover.

The Crown Jewel Nobody Protected

The most alarming finding was not the data exposure. It was the write access to system prompts.

Lilli's behavioral instructions, the prompts that governed how the AI responded to 43,000 employees making strategic decisions, were stored in the same database that was compromised. A single SQL UPDATE statement could silently rewrite those instructions without any deployment change, audit log, or integrity check.

The implications are severe. An attacker with this access could poison financial models used in M&A advisory. They could embed instructions to exfiltrate confidential client data through normal-looking AI responses. They could strip safety guardrails entirely, turning a trusted internal tool into a vector for misinformation. This is precisely the kind of silent manipulation that makes current AI guardrails little more than security theater when they lack runtime enforcement.

CodeWall put it bluntly: "AI prompts are the new Crown Jewel assets." We agree. And we would add this: most enterprises treat prompts as configuration, not as critical infrastructure. They store prompts without version control, without access restrictions, without integrity monitoring. They would never treat a database schema or an API key this way. Yet prompts control the behavior of systems that influence strategic decisions across entire organizations.

This is why we built Swept AI's supervision layer to provide continuous prompt integrity monitoring. If Lilli's system prompts had been under this kind of active governance, the unauthorized write operation to the prompts table would have triggered an immediate alert, long before any damage could propagate through the organization.

The Governance Gap

This breach illustrates a gap that we at Swept AI have been documenting since our founding: the gap between AI deployment velocity and AI governance maturity.

Enterprises deploy AI platforms at speed. They build hundreds or thousands of AI assistants, connect them to sensitive data sources, and roll them out to large user populations. The business pressure to do this is real and understandable.

But the governance infrastructure lags behind. Most organizations lack basic visibility into their AI systems. They cannot answer fundamental questions: How many AI assistants exist in our organization? What data can each one access? Who created them? What instructions govern their behavior? Have those instructions changed since deployment?

McKinsey had 384,000 AI assistants. That number alone raises a governance question: did anyone have a complete inventory of what each assistant could do, what data it touched, and what prompts governed its behavior?

This is not a criticism of McKinsey specifically. We see this pattern in nearly every enterprise AI deployment. The tools for building AI systems have outpaced the tools for governing them. Traditional security focuses on network perimeters, authentication, and known vulnerability classes. AI governance requires something different: continuous monitoring of AI behavior, prompt integrity verification, output evaluation, and centralized visibility across an organization's entire AI footprint.

What Continuous AI Monitoring Looks Like

Point-in-time security scans failed here. A vulnerability that persisted through over two years of production operation was invisible to standard tools. This is not a failure of those tools. It is a scope mismatch. They were built for a different class of application.

AI platforms require continuous, behavior-aware monitoring. That means watching not just for traditional vulnerabilities at the perimeter, but for changes in how AI systems behave over time. Specifically, enterprises need:

Prompt integrity monitoring. System prompts should be treated like source code: versioned, access-controlled, and continuously verified against a known-good baseline. Any unauthorized modification should trigger an immediate alert. If McKinsey's prompts had been under integrity monitoring, a write operation to the prompts table would have been detected before any damage occurred.

AI system inventory and visibility. Organizations need a centralized registry of every AI assistant, every data connection, every prompt, and every user interaction pattern. Without this, governance is impossible. You cannot secure what you cannot see. This is exactly what a comprehensive AI evaluation framework provides.

Behavioral drift detection. AI systems change over time, through prompt modifications, data changes, or model updates. Continuous evaluation of AI outputs against established baselines catches drift before it becomes a breach.

Deterministic policy enforcement. Critical boundaries should not depend on probabilistic systems. If an AI assistant should never access certain data, expose certain information, or take certain actions, those constraints belong in code, not in prompts. Building an AI trust layer means enforcing policies deterministically at runtime.

Five Steps Every Enterprise Should Take Now

The McKinsey breach is a signal, not an outlier. If a firm with McKinsey's security resources and budget can have a vulnerability of this magnitude persist undetected for years, the probability that similar vulnerabilities exist in other enterprise AI platforms is high.

Here is what we recommend:

  1. Audit your AI inventory. Catalog every AI assistant, agent, and model deployment in your organization. Document what data each can access, who created it, and what prompts govern its behavior.

  2. Implement prompt integrity monitoring. Treat system prompts as critical assets. Version them, restrict write access, and monitor for unauthorized changes continuously.

  3. Move beyond perimeter security for AI. Traditional vulnerability scanners are necessary but insufficient. Layer in AI-specific monitoring that evaluates behavior, not just infrastructure.

  4. Establish deterministic failsafes in code. Policy boundaries that matter should be enforced in deterministic systems, not in prompt instructions that an attacker or a model update can override.

  5. Build for regulatory readiness. The EU AI Act and emerging frameworks worldwide will require organizations to demonstrate governance over their AI systems. Incidents like this will carry regulatory consequences. Building governance infrastructure now is an investment, not an expense.

The Two-Hour Wake-Up Call

CodeWall's agent needed two hours. That is the window between "everything looks fine" and "full read-write access to the production database." Two hours to compromise a platform used by 70% of one of the world's most sophisticated consulting firms.

The question for every enterprise leader is straightforward: if an autonomous agent targeted your AI platform today, what would it find in two hours?

At Swept AI, we build the trust and governance layer that helps organizations answer that question with confidence. Not after a breach, but before one. Because the gap between deploying AI and securing AI is where the real risk lives. And closing that gap is not optional.

Join our newsletter for AI Insights