Pull a vendor AI master services agreement signed before the NAIC Model Bulletin on Use of Artificial Intelligence Systems by Insurers was adopted, and read it from a market conduct examiner's perspective. The contract probably gives the vendor broad latitude to update the model without notice. It probably treats training data as the vendor's confidential property. It probably caps liability at twelve months of fees. It probably says nothing about bias testing artifacts, drift notifications, or sub-processor disclosure.
Now read the bulletin's Section 4 obligations against that contract. The carrier owes regulators evidence that it understood the model before deployment, that it monitors the model in production, that it can produce the documentation supporting both. None of those obligations transfer to the vendor under the bulletin. They sit with the carrier, regardless of whether the carrier has the contractual rights to do the work.
That is the gap the NAIC AI Systems Evaluation Tool pilot keeps surfacing. Carriers receive an evaluation request, ask their vendor for a model card or a bias testing summary, and discover that the contract did not require the vendor to produce one. The regulator does not accept "the vendor would not share it" as a response.
The fix is contract retrofits at renewal. Not a wholesale rewrite, which most counterparties will resist. A targeted set of eight clauses that close the documented gaps regulators have surfaced in pilot reviews and bulletin examinations. Below is what each clause should accomplish, where carriers tend to leave money on the table, and how the clauses interact with the NAIC's pending vendor registry framework.
The Eight Clauses
1. Audit Rights
The bulletin assumes the carrier can validate vendor claims. Most pre-2024 contracts give the carrier no inspection right beyond a SOC 2 report. That is insufficient. The audit clause should permit the carrier (or its designated third party under NDA) to inspect, on reasonable notice and at reasonable cadence, the vendor's documentation related to model development, training data provenance, testing results, and incident logs for the model the carrier uses. The clause should explicitly allow the audit to be triggered by a regulatory inquiry, not only by an annual schedule.
The negotiating compromise vendors typically offer is "annual SOC 2 plus model card on request." That is not enough when an examiner asks for evidence of how a specific model decision was reached. Hold the line on the right to inspect underlying documentation, even if the cadence is constrained.
2. Model Card Disclosure
A model card documents architecture, training data sources, intended use, known limitations, and version history. The bulletin treats this as foundational documentation; the pending vendor registry will require vendors to file something close to it with regulators directly. Your contract should require the vendor to deliver a current model card on signing, on every model version change, and on regulator request. The carrier needs the model card before the regulator does, every time. Otherwise the carrier learns about a vendor's own characterization of the model from the registry, which is the wrong order.
Specify what the model card must include at minimum: training data date range, test data composition, validation methodology, fairness testing protocol, known failure modes, and supported and unsupported use cases. Vendors commonly answer vague clauses with vague model cards. Specificity in the contract produces specificity in the artifact.
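One way to operationalize that minimum list is a completeness check run on every model card the vendor delivers. A minimal sketch, assuming the card arrives as structured data; the field names here are illustrative, not a standard schema:

```python
# Check a vendor-supplied model card against the contract's minimum
# disclosure list. Field names are illustrative assumptions, not a
# regulatory or industry-standard schema.

REQUIRED_FIELDS = [
    "training_data_date_range",
    "test_data_composition",
    "validation_methodology",
    "fairness_testing_protocol",
    "known_failure_modes",
    "supported_use_cases",
    "unsupported_use_cases",
]

def missing_model_card_fields(model_card: dict) -> list:
    """Return the required fields that are absent or left empty."""
    return [field for field in REQUIRED_FIELDS
            if not str(model_card.get(field, "")).strip()]
```

A card that fails the check goes back to the vendor before it enters the diligence file, which keeps the contractual minimum enforceable in practice rather than only on paper.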
3. Drift-Notification SLA
Models drift. The vendor will notice drift in its own monitoring before the carrier notices it in production outcomes. The contract should obligate the vendor to notify the carrier within a defined window (typically 72 hours) when the vendor's monitoring detects performance degradation, distribution shift, or fairness metric deterioration above defined thresholds. The clause should distinguish between informational drift (logged, summarized in quarterly reports) and material drift (notified within the SLA).
The hardest part of negotiating this clause is defining material drift in advance. A starting point: any change of more than 5% in headline accuracy, any disparate impact ratio deterioration that crosses the four-fifths rule threshold, any spike in flagged outputs above the historical baseline. Get the definition into the contract; do not leave it for vendor interpretation.
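The starting-point thresholds above can be expressed as an explicit materiality test, which is useful both for drafting the definition and for checking the vendor's quarterly reports against it. A minimal sketch, assuming the 5% accuracy change is measured as an absolute shift and that a flagged-output "spike" means exceeding a contractually defined multiple of the historical baseline; all names and the spike multiplier are assumptions for illustration:

```python
# Sketch of the contractual materiality thresholds discussed above.
# Thresholds and the spike multiplier are illustrative assumptions.

FOUR_FIFTHS = 0.80            # disparate impact ratio floor (four-fifths rule)
ACCURACY_DELTA_LIMIT = 0.05   # more than 5% headline accuracy change

def is_material_drift(baseline_accuracy: float,
                      current_accuracy: float,
                      disparate_impact_ratio: float,
                      flagged_rate: float,
                      baseline_flagged_rate: float,
                      flagged_spike_multiplier: float = 2.0) -> bool:
    """Return True if any contractual materiality threshold is crossed."""
    # Headline accuracy shift beyond the agreed limit is material.
    if abs(current_accuracy - baseline_accuracy) > ACCURACY_DELTA_LIMIT:
        return True
    # Disparate impact ratio falling below four-fifths is material.
    if disparate_impact_ratio < FOUR_FIFTHS:
        return True
    # Flagged outputs spiking above the agreed multiple of baseline.
    if flagged_rate > flagged_spike_multiplier * baseline_flagged_rate:
        return True
    return False
```

Whatever the final numbers, the point is that the test must be mechanical: either a reported metric crosses a written threshold and the 72-hour SLA applies, or it does not.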
4. Training-Data Attestation
Training data quality is a regulatory concern. The bulletin requires carriers to assess data quality and lineage; the NAIC working group's spring 2026 session confirmed training data sources will be a registry disclosure category. Your contract should obligate the vendor to attest, in writing and with reasonable detail, to the sources, date ranges, and licensing status of the training data used for the model the carrier deploys.
Vendors push back on this because they want training data to remain trade-secret protected. The compromise that usually works: categorical disclosure rather than record-level disclosure. The vendor identifies the source categories (public claims data, licensed industry data, synthetic augmentation, customer feedback) and confirms the licensing posture of each, without exposing specific datasets to competitors.
5. Sub-Processor Disclosure
A model often involves more parties than the named vendor. A foundation model API, a hosting provider, a data labeling service, a fine-tuning partner. Each is a sub-processor with access to data or influence over outputs. The carrier's diligence file is incomplete without knowing the chain.
The contract should require the vendor to maintain a current list of sub-processors that handle data covered by the agreement or contribute materially to model behavior, and to notify the carrier of changes to that list within a defined window. Pair this clause with a right to object to new sub-processors that present unacceptable risk, even if the right is rarely exercised. The notification creates the diligence trail.
6. Fairness-Test Sharing
Bias testing is the area where pre-bulletin contracts are weakest. Most are silent. The carrier needs the right to receive the vendor's fairness testing results on a defined cadence (typically quarterly or on every material model change), specified for the protected classes that apply in the carrier's operating states. For carriers with substantial Colorado, Connecticut, or New York exposure, the protected-class list is non-negotiable.
The clause should also reserve the right to request that the vendor run a fairness test on a carrier-specified test set, the carrier covering the cost. This matters because the vendor's general fairness testing may not reflect the carrier's actual book of business. A vendor that cannot or will not run a custom fairness test against the carrier's portfolio is a vendor whose testing the carrier cannot validate.
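The disparate impact computation behind that validation is simple enough to sketch, which is worth doing because the clause should reference a defined metric rather than "fairness testing" in the abstract. A minimal illustration, assuming decision records carry a protected-class attribute and an approval outcome; the field names are assumptions:

```python
# Illustrative disparate impact ratio computation over a carrier-specified
# test set. Record structure ("approved", the group attribute) is assumed.

from collections import defaultdict

def disparate_impact_ratios(decisions, protected_attr):
    """Each group's approval rate divided by the highest group's rate.

    decisions: iterable of dicts with keys `protected_attr` and 'approved'.
    Ratios below 0.80 cross the four-fifths rule threshold.
    """
    approved = defaultdict(int)
    total = defaultdict(int)
    for row in decisions:
        group = row[protected_attr]
        total[group] += 1
        approved[group] += 1 if row["approved"] else 0
    rates = {g: approved[g] / total[g] for g in total}
    top = max(rates.values())
    return {g: rate / top for g, rate in rates.items()}
```

Running this on the carrier's own portfolio, rather than the vendor's general test set, is exactly the validation the clause reserves the right to demand.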
7. Exam-Cooperation
When a regulator opens a market conduct exam involving a vendor's model, the carrier needs the vendor's cooperation, fast. Most pre-bulletin contracts do not address regulatory inquiries. Some affirmatively limit the vendor's obligation to respond to anything short of a subpoena served directly on the vendor.
The clause should obligate the vendor to cooperate with regulatory inquiries that reach the carrier concerning the vendor's model, including providing technical witnesses, documentation, and explanations within timeframes the carrier specifies based on the regulator's deadline. The vendor bears its own cost up to a defined cap; the carrier covers cooperation effort beyond the cap. Without this clause, examiners' deadlines collide with vendors' contractual silence, and the carrier eats the resulting friction.
8. Exit and Escrow
The hardest thing to negotiate, and the most important if a vendor is acquired, becomes insolvent, or stops supporting the model. The clause should provide for a defined transition period at the carrier's election, continued license to use the most recent model version during transition, and either source-code escrow or a documented transition data package (model weights, inference code, supporting documentation) deliverable to the carrier or a successor vendor. For a model deeply embedded in pricing or claims operations, the absence of this clause is an existential risk dressed as a contractual gap.
How to Retrofit at Renewal
A contract amendment cycle that tries to introduce all eight clauses at once will stall. The faster path is a tiered approach. First, identify every AI vendor whose contract is up for renewal in the next twelve months. Map the renewal calendar against the registry framework's expected effective date. Carriers have less leverage with vendors renewing before the framework lands; vendors renewing after will be more receptive because the registry creates external pressure they cannot avoid.
Second, prioritize clauses by regulatory exposure. Audit rights, model card disclosure, and fairness-test sharing produce the documentation regulators ask for first. Exam-cooperation is the operational lifeline when an inquiry actually arrives. Drift-notification, training-data attestation, sub-processor disclosure, and exit/escrow can follow on a longer cycle if needed.
Third, treat the renewal as a procurement evaluation, not just a redlining exercise. The vendor's willingness to accept the clauses is a signal about how the vendor will behave under regulatory scrutiny. A vendor that rejects audit rights or fairness-test sharing as a matter of policy is a vendor whose model belongs in the carrier's model inventory with a flag for replacement, not renewal.
For carriers building this discipline systematically, Swept AI's evaluation infrastructure connects the contractual rights to the operational evidence: ingesting vendor model cards, running independent bias tests against carrier-specific portfolios, and producing the audit-ready file that bridges procurement, deployment, and ongoing supervision.
What Registry Compliance Does Not Relieve
Some carriers have asked whether the pending vendor registry will reduce the contract burden. It will not. The registry creates regulator-side visibility into vendor disclosures. The carrier's diligence file under the bulletin is a separate obligation, and the carrier needs the contractual rights to assemble that file independently of what the vendor files with the registry.
A registered vendor whose registry filing says one thing and whose contract obligates the vendor to share something else will get the carrier in trouble. The two artifacts must reconcile, and reconciling them requires the carrier to have the contractual rights to see what the vendor told the regulator and validate it against what the vendor told the carrier. Carriers that wait for the registry to land before doing the contract work will discover they have neither leverage nor time.
The bulletin shifted the standard of proof for AI in insurance. The registry is about to formalize the disclosure infrastructure. The contracts have to do the work in between, because no amount of regulatory disclosure relieves the carrier of the obligation to know what its vendors are doing. Eight clauses is not many. Most pre-bulletin MSAs have none of them. The renewal cycle starting now is when that gap closes.
