The NAIC AI Bulletin Grew Teeth. Here's What Insurers Need to Build.

In 2023, the NAIC issued an AI model bulletin built on principles: fairness, transparency, accountability, risk management. For an industry where most AI systems were still in pilot programs, that level of abstraction was appropriate.

Since then, the NAIC has stood up a dedicated AI working group and begun developing an AI systems evaluation tool, while state insurance departments have started writing those principles into examination procedures. Regulators have moved from asking whether carriers have AI policies to asking carriers to produce the evidence those policies were supposed to generate.

From Principles to Procedures

The original bulletin was deliberately high-level. It articulated how existing regulatory obligations apply to AI systems: unfair discrimination prohibitions, rate adequacy standards, market conduct requirements. It did not prescribe specific controls or documentation.

Then AI went from pilot programs to production infrastructure faster than anyone expected. Carriers now run AI across underwriting, claims processing, pricing, fraud detection, and customer engagement. What was a competitive differentiator three years ago is an operational baseline today.

Adoption outpaced governance. Carriers could describe their AI policies in general terms, but when regulators asked for documentation showing how those policies translated into operational controls, the response was often silence or a hastily assembled binder. The distance between "we have an AI governance policy" and "here is evidence that policy produced specific controls, monitoring, and remediation actions" turned out to be significant.

The NAIC's response has been institutional: a working group coordinating regulatory approaches across states, an evaluation tool giving examiners a standardized assessment framework, and examination procedures that ask for evidence rather than statements. This is examination infrastructure, already under construction.

Cybersecurity as the Regulatory Lens

Regulators are now evaluating AI governance through a cybersecurity and operational risk lens, and for carriers, this reframes what "AI governance readiness" actually means.

Cybersecurity examinations follow an established methodology that regulators have refined over decades: data protection controls, access management, change management, incident response, vendor oversight. Regulators are now applying those same categories to AI systems. A carrier that already runs a mature cybersecurity program has most of the organizational muscle required. Applying data classification, access controls, change documentation, and incident response procedures to AI systems is an extension of existing capabilities, not a new discipline to invent from scratch.

Carriers without strong cybersecurity foundations face a harder problem. They need to build both sets of capabilities at the same time, under increasing examination pressure. And regulators are already drawing the connection: a carrier that cannot demonstrate cybersecurity maturity is unlikely to convince an examiner that its AI governance is credible.

Five Areas Regulators Will Examine

The NAIC's evolving expectations cluster around five operational areas.

Data security and integrity goes beyond protecting personally identifiable information. Regulators expect carriers to protect the data that trains and informs AI decisions throughout its lifecycle: encryption, access restrictions, data classification, and integrity checks applied to training data, input data, model parameters, and outputs. When training data is corrupted or unrepresentative, the models it produces inherit those problems. Regulators are beginning to treat data integrity failures as governance failures, not technical bugs.
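
To make that concrete, here is a minimal sketch of one such control: fingerprinting approved training data so that corruption or silent substitution is caught before the next training run. The manifest format and file layout are our illustration, not anything the NAIC prescribes.

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Compute a SHA-256 digest of a data file, streamed in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_manifest(data_files: list[Path], manifest: Path) -> None:
    """Write a manifest of dataset fingerprints at approval time."""
    manifest.write_text(
        json.dumps({str(p): fingerprint(p) for p in data_files}, indent=2)
    )

def verify_manifest(manifest: Path) -> list[str]:
    """Return the files whose contents no longer match the approved manifest."""
    approved = json.loads(manifest.read_text())
    return [name for name, digest in approved.items()
            if fingerprint(Path(name)) != digest]

# Usage: run record_manifest when a dataset is approved for training, and
# verify_manifest as a pre-training gate; a non-empty result blocks the run.
```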

Access and change management. Consider a data scientist pushing a model update to production on a Friday afternoon. Who authorized the change? What was modified? How was it tested? What happens if the update produces unexpected results on Monday? Regulators want documented answers to every one of those questions. Role-based access controls, segregation of duties, formal approval workflows, version control, and rollback procedures are the baseline.
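
What might a documented change record look like in practice? A minimal sketch follows, with illustrative field names rather than any regulatory schema; the point is that the authorization, testing, and rollback answers exist as structured data, checked at write time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class ModelChangeRecord:
    """One production model change, capturing the answers an examiner asks for."""
    model_id: str
    from_version: str
    to_version: str
    requested_by: str      # who pushed the change
    approved_by: str       # who authorized it
    test_report_ref: str   # pointer to validation results for the new version
    rollback_version: str  # version to restore if Monday brings surprises
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        # Segregation of duties enforced in code, not just in a policy document.
        if self.requested_by == self.approved_by:
            raise ValueError("Segregation of duties: requester cannot self-approve.")
```

Rejecting self-approval at the moment the record is written, rather than in a policy binder, is exactly the kind of operational control the new examination procedures are written to surface.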

Traceability and auditability is where many carriers discover their biggest gap. For any AI-driven outcome, regulators expect end-to-end lineage: which data inputs, which model version, which human decisions influenced the result. When a policyholder's claim is denied or their premium increases, an examiner will not accept a general explanation of how the model works. The examiner will ask about a specific decision for a specific policyholder and expect the carrier to reconstruct how that outcome was produced.
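
Here is a sketch of the lineage record that makes that reconstruction possible, assuming each scoring call is logged at decision time. The fields are illustrative; what matters is that inputs, model version, and human involvement are captured together, once, when the decision happens.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    """End-to-end lineage for one AI-driven outcome, written at decision time."""
    decision_id: str
    policyholder_id: str
    model_id: str
    model_version: str
    input_payload: dict         # the exact features the model saw
    model_output: dict          # score, decision, and any reason codes
    human_override: str | None  # who overrode the model and why, if anyone

    def to_log_line(self) -> str:
        record = asdict(self)
        record["timestamp"] = datetime.now(timezone.utc).isoformat()
        record["input_hash"] = hashlib.sha256(
            json.dumps(self.input_payload, sort_keys=True).encode()
        ).hexdigest()
        return json.dumps(record, sort_keys=True)

# When an examiner asks about one policyholder's denied claim, the carrier
# replays this record: same inputs, same model version, same output.
```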

Third-party and vendor risk management catches carriers who assume their vendor handles governance. It doesn't work that way. The regulatory obligation stays with the carrier regardless of who built or operates the AI system. Vendor contracts must explicitly address AI governance obligations, cybersecurity controls, incident notification, documentation access, and audit rights. Examiners will expect carriers to demonstrate that vendor oversight extends to AI-specific risks, not just general IT vendor management.

AI-specific incident response requires expanding traditional IR programs beyond breaches and outages. AI systems fail differently: model degradation, biased outcomes, unintended use, corrupted training data. A model producing discriminatory outcomes isn't "down" in any traditional sense. It's operating, generating results, and accumulating regulatory exposure with every decision it makes. Carriers need incident definitions, escalation procedures, and response playbooks that cover these failure modes specifically.
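
One illustration: a minimal disparate-impact monitor that raises an incident when outcome rates diverge across groups. The four-fifths threshold here is a common screening heuristic borrowed from employment-law practice, not an insurance regulatory mandate, and real bias testing is considerably more involved.

```python
def selection_rates(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """outcomes maps group -> (favorable_decisions, total_decisions)."""
    return {g: fav / total for g, (fav, total) in outcomes.items() if total > 0}

def disparate_impact_check(outcomes: dict[str, tuple[int, int]],
                           threshold: float = 0.8) -> list[str]:
    """Return incident messages for groups whose selection rate falls below
    `threshold` times the highest group's rate (the four-fifths heuristic)."""
    rates = selection_rates(outcomes)
    if not rates:
        return []
    best = max(rates.values())
    return [
        f"INCIDENT: selection rate for '{g}' is {r:.2f} "
        f"({r / best:.0%} of the highest-rate group; threshold {threshold:.0%})"
        for g, r in rates.items() if best > 0 and r < threshold * best
    ]

# Example: approvals by group over a monitoring window.
alerts = disparate_impact_check({"group_a": (840, 1000), "group_b": (610, 1000)})
for alert in alerts:
    print(alert)  # feeds the escalation procedure, not just a dashboard
```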

The Evidence Gap

Most carriers have AI policies. Large carriers typically have documented principles, risk assessment frameworks, and oversight committees. That is necessary work, but an examiner asking for timestamped audit trails, continuous monitoring records, bias testing results with documented cadence, and incident logs with remediation actions needs more than a policy binder.

We see this consistently across carriers preparing for examination. The ones who built governance as an operational capability, with evaluation frameworks and supervision platforms generating evidence continuously, can produce what examiners ask for on demand. The ones who treated governance as a documentation exercise learn about their gaps at the worst possible time: during the examination itself.

Reconstructed documentation is weaker than it looks. It lacks temporal evidence, so an examiner sees a single snapshot assembled under pressure, with no proof that governance was operating between examinations. We've written about this dynamic in depth in our analysis of what regulators are already asking.
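
Here is one way governance can produce temporal evidence as a byproduct of operations: an append-only log in which each entry is hash-chained to the previous one, so records cannot be quietly backfilled before an examination. This is a minimal sketch; a production deployment would use a managed ledger or WORM storage, but the property it demonstrates is the one examiners care about.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def append_event(log_path: Path, event: dict) -> None:
    """Append a timestamped event, chained to the hash of the previous entry."""
    prev_hash = "0" * 64
    if log_path.exists():
        lines = log_path.read_text().splitlines()
        if lines:
            prev_hash = json.loads(lines[-1])["entry_hash"]
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    with log_path.open("a") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")

def verify_chain(log_path: Path) -> bool:
    """Recompute the chain; any edited or inserted entry breaks verification."""
    prev_hash = "0" * 64
    for line in log_path.read_text().splitlines():
        entry = json.loads(line)
        claimed = entry.pop("entry_hash")
        recomputed = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev_hash"] != prev_hash or claimed != recomputed:
            return False
        prev_hash = claimed
    return True
```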

Building for the Examination That's Coming

The NAIC bulletin started as guidance. The working group, the evaluation tool, and state-level adoption of examination procedures are turning it into examination criteria. Regulators will assess AI governance with the same rigor they apply to cybersecurity and financial controls.

The work is concrete: build AI system inventories, implement access and change controls, establish traceability from input to decision, extend vendor oversight to AI-specific risks, expand incident response programs to cover model failures. Carriers operating in states with additional AI legislation, like Colorado's SB21-169, face overlapping requirements that compound the urgency.

If your AI governance program can describe your policies but cannot produce evidence that those policies generated specific controls, that gap is already visible to regulators. Building the infrastructure that produces compliance evidence as a byproduct of operations is more efficient than assembling it under examination pressure, and considerably more convincing to an examiner who has seen the difference.
