If you cannot list every model that touches a policyholder by Friday, the regulator will list them for you.
That is the operating assumption behind the NAIC AI Systems Evaluation Tool pilot, which has begun issuing data requests to selected carriers. The first request in nearly every pilot engagement is the same: produce a complete inventory of AI and predictive models in use across underwriting, pricing, claims, fraud, and customer service. The Foley & Lardner analysis of what to do when an NAIC AI Systems Evaluation Tool pilot request arrives confirms that the inventory request is the gating artifact for the rest of the engagement. Without a credible inventory, the carrier cannot answer any of the follow-on questions about governance, validation, monitoring, or bias testing.
Most carriers do not have an inventory that survives a twenty-question follow-up. They have a spreadsheet maintained by data science, a separate spreadsheet maintained by procurement that lists vendor tools, a third list embedded in actuarial filings, and a fourth that lives in the head of whoever was last asked. None of these is the artifact the regulator is asking for.
The model inventory the NAIC pilot expects is a single, governed system of record. It captures every model in production scope, including vendor-supplied models the carrier embeds in its decisioning. It is current, it is queryable, and it is the source of truth for every downstream regulatory artifact: rate filings, market conduct exam responses, the audit trail's keyspace, and the bias-testing program's coverage list.
The Minimum Schema
The inventory schema below is the minimum required to satisfy a serious regulatory inquiry. Each field corresponds to a question examiners ask and a control that downstream artifacts depend on. The schema also aligns with the NIST AI Risk Management Framework's MAP function, which calls for organizations to characterize the AI systems they operate before they can govern them.
Model ID. A stable identifier that persists across versions and serves as the foreign key for every downstream artifact: rate filings, audit logs, monitoring dashboards, vendor contracts. Carriers that use the data science notebook filename as the model ID rebuild the inventory every six months when the notebook is renamed.
Owner. A named individual, with backup, accountable for the model's continued fitness for purpose. Not a team, not a department, not a generic mailbox. When the model produces an unexpected output and the regulator calls, the owner is the person who answers.
Purpose. A one-sentence statement of what the model decides and what action follows from its output. "Predicts the probability that an FNOL submission contains material misrepresentation, used to route the claim to special investigations review at scores above 0.7." Vague purpose statements like "fraud detection" do not survive examiner scrutiny.
Training data lineage. Source systems, extraction date range, sample construction rules, and the volume of records at each stage of preparation. Sufficient detail that a validator could reconstruct the modeling sample. The NAIC Model Bulletin on the Use of Algorithms, Predictive Models, and AI Systems by Insurers names data lineage as a foundational governance requirement.
Deployment scope. The states, lines of business, customer segments, and decision contexts in which the model is in production. A model approved for personal auto rating in twelve states should not be quietly running on commercial property submissions because someone reused the API endpoint.
Version. The current production version, with a hash that ties to the underlying artifacts (training code, training data snapshot, hyperparameters, evaluation results). Without a version hash, "model X" can mean three different things to three different teams.
Last validation date. The date of the most recent independent validation, the validator's name, and a pointer to the validation report. NAIC examiners ask for the validation report by name once they see the inventory.
Drift status. A current indicator from the monitoring system, refreshed at a defined cadence. Green if performance is within tolerance bands. Yellow if a metric has crossed a warning threshold. Red if a remediation workflow is open. The inventory is not a static document. It is a live status board.
Vendor flag. A boolean indicating whether the model is built in-house or supplied by a third party, with a link to the vendor contract and the vendor's documentation. Vendor models carry a different governance burden, and the inventory must surface that distinction at a glance.
A nine-field schema is not a spreadsheet, even though it can be represented as one. It is the minimum data contract between the carrier's AI program and the regulators who will examine it.
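To make the data contract concrete, here is one way the nine fields might be expressed as a typed record. This is a minimal sketch, not a prescribed format; every class, field, and value name below is illustrative rather than drawn from any NAIC template.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
import hashlib


class DriftStatus(Enum):
    GREEN = "green"    # performance within tolerance bands
    YELLOW = "yellow"  # a metric has crossed a warning threshold
    RED = "red"        # a remediation workflow is open


@dataclass(frozen=True)
class ModelRecord:
    model_id: str                # stable across versions; the foreign key everywhere
    owner: str                   # a named individual, not a team or mailbox
    owner_backup: str
    purpose: str                 # one sentence: what it decides, what action follows
    training_data_lineage: str   # source systems, date range, sample rules, volumes
    deployment_scope: list[str]  # states, lines, segments, decision contexts
    version: str                 # current production version
    version_hash: str            # ties the version to its underlying artifacts
    last_validation: date        # most recent independent validation
    validator: str
    validation_report: str       # pointer to the report examiners will ask for
    drift_status: DriftStatus    # refreshed from monitoring at a defined cadence
    is_vendor: bool              # third-party models trigger a different workflow
    vendor_contract: str | None = None  # required when is_vendor is True


def version_hash(*artifact_paths: str) -> str:
    """Hash the training code, data snapshot, hyperparameters, and evaluation
    results so that 'model X v3' means exactly one thing to every team."""
    digest = hashlib.sha256()
    for path in sorted(artifact_paths):
        with open(path, "rb") as f:
            digest.update(f.read())
    return digest.hexdigest()[:16]
```

A record like this can still be rendered as a spreadsheet row, but the typed form makes the contract enforceable: a model that cannot populate every field does not get an inventory entry.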
Inventory-to-Control Mapping
The reason the inventory matters more than any other governance artifact is that every other artifact depends on it. A single, credible inventory feeds the rest of the regulatory program without duplication or reconciliation work.
NAIC AI Systems Evaluation Tool pilot responses begin with the inventory and reference it throughout. When the pilot questionnaire asks how the carrier identifies AI systems for governance scope, the answer is the inventory. When it asks how the carrier ensures vendor models receive equivalent oversight to in-house models, the answer is the vendor flag in the inventory and the governance workflow it triggers. When it asks how the carrier monitors for drift, the answer is the drift status field and the underlying monitoring system that updates it.
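As a sketch of how those questionnaire answers become queries, assuming inventory rows shaped like the record above (plain dictionaries here so the snippet stands alone, with hypothetical IDs):

```python
# Hypothetical inventory rows; field names mirror the schema above.
inventory = [
    {"model_id": "clm-fraud-0042", "is_vendor": False, "drift_status": "green"},
    {"model_id": "uw-score-0007",  "is_vendor": True,  "drift_status": "yellow"},
]

# "How do you identify AI systems in governance scope?" The inventory is the scope.
governance_scope = [m["model_id"] for m in inventory]

# "How do vendor models receive equivalent oversight?" The vendor flag
# routes each flagged model into the vendor governance workflow.
vendor_models = [m["model_id"] for m in inventory if m["is_vendor"]]

# "How do you monitor for drift?" The drift status field, refreshed from
# the monitoring system, surfaces anything outside tolerance.
needs_attention = [m["model_id"] for m in inventory if m["drift_status"] != "green"]
```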
State consumer data rights frameworks introduce a second audience for the inventory. When a consumer requests information about the data and decisions made about them, the carrier must be able to identify which models contributed to which decisions. That identification runs through the inventory. A carrier without a model ID for every model in scope cannot answer a consumer data request without a manual investigation that takes weeks.
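A minimal sketch of that lookup, assuming a decision log in which every entry already carries a model ID and version (all names hypothetical):

```python
# Hypothetical decision log; each entry references the inventory's keyspace.
decision_log = [
    {"policy": "P-1001", "model_id": "clm-fraud-0042", "version": "3.2",
     "decision": "routed to SIU review"},
    {"policy": "P-1001", "model_id": "uw-score-0007",  "version": "1.9",
     "decision": "rating tier assigned"},
]

def models_behind(policy: str) -> list[dict]:
    """Every model that contributed to decisions about this policyholder:
    the query a consumer data request reduces to when model IDs exist."""
    return [entry for entry in decision_log if entry["policy"] == policy]
```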
State rate filings under the NAIC Model Bulletin reference the inventory implicitly: every model named in a rate filing must have a corresponding inventory entry, and every inventory entry that affects rates must be reconcilable to a filed factor. Carriers that maintain the inventory as the source of truth for both rate filings and pilot responses eliminate the reconciliation step that consumes weeks of senior actuarial time during exam season.
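The reconciliation itself reduces to a two-way set check, sketched below with hypothetical model IDs:

```python
def reconcile_filings(filed: set[str], rate_affecting: set[str]) -> dict[str, set[str]]:
    """Two-way check: every model named in a rate filing has an inventory
    entry, and every rate-affecting inventory entry maps to a filed factor."""
    return {
        "filed_but_not_in_inventory": filed - rate_affecting,
        "in_inventory_but_not_filed": rate_affecting - filed,
    }

# Hypothetical IDs; in practice both sets come from governed systems of record.
gaps = reconcile_filings(
    filed={"uw-score-0007", "rate-terr-0003"},
    rate_affecting={"uw-score-0007", "rate-terr-0003", "rate-terr-0004"},
)
# gaps["in_inventory_but_not_filed"] == {"rate-terr-0004"}: a model affecting
# rates with no filed factor, exactly the discrepancy exam season surfaces.
```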
The inventory is also the keyspace for the audit trail every insurance carrier now needs to maintain. Every audit log entry references a model ID and version. The audit trail is not queryable without the inventory, and the inventory is not actionable without the audit trail. They are two halves of the same compliance system.
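One way to keep the two halves honest is a referential-integrity check over the audit trail's keyspace; a sketch with hypothetical entries:

```python
# (model_id, version) pairs the inventory knows about.
inventory_keys = {("clm-fraud-0042", "3.2"), ("uw-score-0007", "1.9")}

audit_entries = [
    {"ts": "2025-06-01T12:00:00Z", "model_id": "clm-fraud-0042", "version": "3.2"},
    {"ts": "2025-06-01T12:00:05Z", "model_id": "clm-fraud-0041", "version": "3.1"},
]

# Any orphan is an audit event attributed to a model the inventory does not
# know: either a stale log, or worse, an unregistered model in production.
orphans = [e for e in audit_entries
           if (e["model_id"], e["version"]) not in inventory_keys]
```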
Vendor management is the fourth consumer. A vendor model with a flag in the inventory must have a corresponding contract that addresses the regulatory requirements the carrier has assumed by deploying it. The specific clauses that contract must contain are detailed in our companion piece on vendor AI contract clauses for market conduct exams.
A single inventory feeds four downstream programs. A carrier without one is rebuilding parts of it every time a regulator, consumer, or vendor manager asks a question.
Inventory Governance
A schema is a starting point. Without governance, the inventory drifts out of date within a quarter, and the carrier finds itself in the same position it started in: uncertain which models are in production, who owns them, and what state they are in.
Governance for the inventory has four components, and each addresses a specific failure mode that has surfaced in carriers that built the inventory once and walked away.
The first is ownership. A named individual at the director level or above is accountable for the inventory's accuracy. That accountability is documented in the AI governance charter. When an examiner finds a discrepancy between the inventory and the deployed reality, the accountable owner is the person who explains why.
The second is refresh cadence. The inventory must be refreshed on a defined schedule, not on demand. Monthly is the minimum for an active AI program. Quarterly is acceptable only for portfolios of fewer than ten models. The refresh process verifies every field for every entry: that the owner is still in the role, that the deployment scope still matches production configuration, that the version field matches the version actually serving traffic, that the drift status reflects the current monitoring output.
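A refresh pass of that kind can be expressed as per-field checks against the systems that own the ground truth; a minimal sketch, with the input names assumed rather than standardized:

```python
def refresh_findings(entry: dict, active_staff: set[str],
                     serving_version: str, monitor_status: str) -> list[str]:
    """Verify one inventory entry against the systems of record: HR for the
    owner, the serving layer for the version, monitoring for drift status."""
    findings = []
    if entry["owner"] not in active_staff:
        findings.append("owner no longer in role; reassign before next refresh")
    if entry["version"] != serving_version:
        findings.append(f"inventory says v{entry['version']}, "
                        f"production is serving v{serving_version}")
    if entry["drift_status"] != monitor_status:
        findings.append("drift status is stale against current monitoring output")
    return findings
```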
The third is change triggers. Certain events require an immediate inventory update outside the normal cadence: a new model deployment, a model retirement, a version change that materially affects behavior, a change in deployment scope, a change in vendor relationship, a discovered material discrepancy. The triggers are documented, the responsible parties are named, and the update path is wired into the deployment and procurement workflows so that an inventory update is a required step in those processes rather than a manual follow-up.
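Wired into the deployment workflow, the trigger becomes a hard gate rather than a reminder. A minimal sketch, with hypothetical names:

```python
def inventory_gate(model_id: str, version: str,
                   inventory_keys: set[tuple[str, str]]) -> None:
    """Pre-deployment check: promotion fails unless the inventory already
    holds an entry for this exact model and version."""
    if (model_id, version) not in inventory_keys:
        raise RuntimeError(
            f"{model_id} v{version} has no inventory entry; "
            "register the model before deployment can proceed"
        )
```

The same gate, pointed at the procurement workflow, covers vendor models: no contract execution without a flagged inventory entry.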
The fourth is independent verification. At least annually, a function independent of the model owners verifies the inventory against the deployed reality. That verification samples production traffic, identifies the models actually serving requests, and reconciles them against the inventory. Discrepancies surface gaps in the change-trigger workflow and drive remediation. Carriers that skip independent verification end up with inventories that look complete but are missing models that are live in production.
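The verification step reduces to sampling and set reconciliation; a sketch under the assumption that production requests log the serving model's ID and version:

```python
import random

def unregistered_models(traffic_log: list[dict],
                        inventory_keys: set[tuple[str, str]],
                        sample_size: int = 1000) -> set[tuple[str, str]]:
    """Sample production traffic, collect the (model_id, version) pairs
    actually serving requests, and return any pair the inventory lacks."""
    sample = random.sample(traffic_log, min(sample_size, len(traffic_log)))
    observed = {(r["model_id"], r["version"]) for r in sample}
    return observed - inventory_keys
```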
These four components are the difference between an inventory that is the source of truth and an inventory that is a museum exhibit.
What the Inventory Does Not Solve
The inventory is the foundation of the AI governance program. It is not the program itself.
A carrier with a perfect inventory still needs validation, monitoring, bias testing, audit trails, vendor management, and an explanation artifact for each rate filing. The inventory tells the carrier and the regulator what models exist. It does not, on its own, tell either party that those models are governed appropriately. What the inventory does is make every other governance activity possible. Without it, the carrier cannot define the scope of validation, cannot prioritize monitoring investment, cannot demonstrate bias-testing coverage, and cannot produce a credible response to any regulatory inquiry.
Certification infrastructure treats the inventory as the spine of the governance program. Every validation report, every monitoring alert, every audit log, every rate filing, and every vendor contract references the inventory by model ID and version. The inventory is generated from the deployment system rather than maintained by hand, which closes the gap between what the carrier believes is in production and what is actually serving traffic.
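A sketch of that generation step, assuming the deployment system can enumerate its active deployments (all names hypothetical):

```python
def entries_from_deployments(deployments: list[dict]) -> list[dict]:
    """Derive the deployed-state fields of each inventory entry from the
    deployment system's own records, so the inventory reflects what is
    serving traffic rather than what was last typed into a spreadsheet."""
    return [
        {
            "model_id": d["model_id"],
            "version": d["version"],
            "deployment_scope": d["routes"],  # the contexts actually wired up
            "is_vendor": d.get("vendor") is not None,
        }
        for d in deployments
        if d["status"] == "active"
    ]
```

The governed human inputs, owner, purpose, lineage, and validation status, are then joined onto these generated entries by model ID rather than maintained in a parallel document.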
The carriers that produce a usable inventory in response to the first NAIC pilot request are not the ones with the most sophisticated AI capabilities. They are the ones that built the inventory before the request arrived. The ones that wait for the request to start the work spend the next ninety days building the inventory under examination pressure, while their competitors spend that time answering the substantive questions the inventory was supposed to enable.
The inventory is the artifact that determines whether a carrier engages with the regulator on the merits of its AI governance program or on the basic question of whether it has one.
