Hippo's 273% Loss Ratio: What Happens When AI Can't See Climate

Hippo Holdings reported a net loss ratio of 273% on its homeowners book in the first quarter of 2023.1 In plain terms, the company paid out $2.73 in losses for every $1 of earned premium it collected over those three months. Within a few months Hippo had paused new homeowners business nationwide and announced a strategic shift away from the direct-to-consumer DIY model it had built the brand around.2

The number is unusual on its face. A 100% loss ratio means a carrier broke even on losses before expenses, and 150% in a single quarter already triggers regulatory attention. A 273% loss ratio is the arithmetic signature of a rating algorithm and a real-world distribution of risk that have moved out of alignment.
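
The arithmetic itself is trivial, which is part of why the number carries so much signal. A minimal sketch in Python; the premium figure is hypothetical, and only the 273% ratio comes from Hippo's reported quarter:

```python
def net_loss_ratio(net_losses_and_lae: float, net_earned_premium: float) -> float:
    """Net loss ratio: net losses and loss adjustment expenses
    over net earned premium, for the same period."""
    return net_losses_and_lae / net_earned_premium

# Illustrative figures only: any premium base at a 273% ratio
# implies $2.73 of losses paid for every $1.00 of premium earned.
earned = 100.0          # hypothetical net earned premium ($M)
losses = 2.73 * earned  # losses implied by a 273% ratio
print(f"{net_loss_ratio(losses, earned):.0%}")  # -> 273%
```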

Insurtech commentary tends to treat the result as something specific to Hippo's distribution model or its growth pace. The mechanism that produced the number is more general than that, and every carrier writing property lines with a machine learning rating model trained on historical loss data is going to encounter some version of it. We think it is worth walking through the chain, because the response inside the company and the response inside the reinsurance market both depend on whether the board understood the mechanism in advance.

What Actually Happened at Hippo

Hippo's underwriting was built around a thesis that sensors, smart-home data, and modern data science could price homeowners more accurately than the legacy market. The company grew aggressively in geographies its model assessed as favorable. The model's view of "favorable" was anchored in historical loss data. Most loss data sets used in property pricing extend back fifteen to twenty years and weight more recent experience more heavily, but the most recent experience was already lagging the underlying climate.
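
The lag can be made concrete. The sketch below computes the weighted mean accident year of a recency-weighted, twenty-year training window; the exponential decay and the seven-year half-life are assumptions for illustration, not Hippo's actual weighting:

```python
import math

def effective_vintage(pricing_year: int, window_years: int = 20,
                      half_life: float = 7.0) -> float:
    """Weighted mean accident year of an exponentially
    recency-weighted training window ending at pricing_year - 1."""
    years = range(pricing_year - window_years, pricing_year)
    weights = [math.exp(-(pricing_year - 1 - y) * math.log(2) / half_life)
               for y in years]
    return sum(w * y for w, y in zip(weights, years)) / sum(weights)

# Even with recent years weighted heavily, the model's view of
# "the climate" is centered years in the past.
print(round(effective_vintage(2023), 1))  # ~2015.6 under these assumptions
```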

The 2022 hurricane season delivered Ian. The 2023 convective storm season delivered the most expensive year on record for severe convective storms in the United States.3 Hippo's book had concentrations in territories the model had read as low-tail. The actual tail in those territories was being shaped by a climate the training data had not yet absorbed, which produced losses at a frequency and severity the rating algorithm had not priced for.

The 273% quarter was the arithmetic result. The strategic response, which included the countrywide pause and the pivot toward an agency distribution model, was the operating acknowledgment that the original underwriting thesis had a gap the company could not close fast enough at the existing pace of new business.

The Training-Data Lag Problem

There is a concept worth naming clearly. Every machine learning model that prices catastrophe-exposed property is operating against a version of the climate that is some number of years old. The model can only see what is in the training set. The training set is, almost by definition, a record of a different decade.

Aon estimated that secondary perils (severe convective storms, wildfire, flood, and similar non-tropical-cyclone events) accounted for 86% of global insured losses in 2023.4 The historical mix was weighted much more heavily toward the named perils, with secondary perils a smaller share. A pricing model trained on a decade-long window that ended in 2018 or 2020 was sized for a different distribution. The losses are now arriving in categories the model was not built to price.
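
A carrier can test for this mismatch directly. The sketch below compares the peril mix in a model's training window against recent experience; the training-era mix is illustrative, while the 86% secondary-peril share is Aon's 2023 figure:

```python
# Peril-mix drift check between a model's training window and recent
# experience. Training-era shares are assumed for illustration.
training_mix = {"tropical_cyclone": 0.55, "secondary": 0.45}   # assumed
recent_mix   = {"tropical_cyclone": 0.14, "secondary": 0.86}   # Aon, 2023

def mix_drift(train: dict, recent: dict) -> dict:
    """Percentage-point shift in each peril's share of insured losses."""
    return {p: recent[p] - train[p] for p in train}

for peril, shift in mix_drift(training_mix, recent_mix).items():
    flag = "REVIEW" if abs(shift) > 0.10 else "ok"
    print(f"{peril:18s} {shift:+.0%}  {flag}")
```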

The wildfire example is particularly clean. Elementum Advisors refit its wildfire model after analyzing roughly 2 million U.S. wildfires, with the firm's head of data and analytics explicitly stating that the prior model had been "benchmarked to historical trends and not to today's climate."5 The prior model was a perfectly good answer to a question about a previous climate; the underlying physical system had simply moved past it. The same shape of staleness sits inside every ML rating model that uses pre-2020 loss experience as anchor data.

The Hippo number is what training-data lag looks like when it expresses itself through a quarterly P&L. Most carriers will see a less dramatic version, but the same direction of error: claims arriving from territories and perils the model under-priced, because the model could not see the climate it was actually pricing into.

The Reinsurance Conversation That Follows

A 273% loss ratio is a P&L event that immediately produces a follow-on conversation at the reinsurance renewal. A reinsurer reviewing a cedent's catastrophe placement after a quarter like that one is going to ask three questions. They are predictable, they are standard, and they are the questions every CEO whose carrier uses an ML rating model should be able to answer before the meeting starts.

The first question is what changed in the model since the prior placement. A treaty-grade answer covers retraining cadence, feature changes, weight shifts in territories that produced the loss, and any structural changes to how the model handles secondary perils. Pointing to a data scientist's commit history does not meet that bar.
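
What a treaty-grade change log might look like as a data structure, with every field name illustrative rather than any industry standard:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ModelChangeEntry:
    """One retraining cycle, documented for a treaty audience.
    Field names are illustrative, not a standard."""
    retrained_on: date
    training_window: tuple[int, int]        # first/last accident year
    features_added: list[str] = field(default_factory=list)
    features_removed: list[str] = field(default_factory=list)
    territory_weight_shifts: dict[str, float] = field(default_factory=dict)
    secondary_peril_changes: str = ""

entry = ModelChangeEntry(
    retrained_on=date(2023, 1, 15),
    training_window=(2008, 2022),
    features_added=["roof_age_satellite"],
    territory_weight_shifts={"33801": +0.18},  # hypothetical zip code
    secondary_peril_changes="SCS frequency trend factor introduced",
)
```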

Aggregation comes next. The reinsurer wants to know what total insured value the cedent is carrying in the zip codes that produced the loss, how the model's risk weights ranked those zip codes prior to the loss, and whether the cedent's exposure management framework can demonstrate that the concentration matches the priced-for assumption. The closer the actual aggregation is to the model's implied aggregation, the easier the conversation becomes.
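
The aggregation answer can likewise be reduced to a small table: actual total insured value by zip code, side by side with how the model ranked those zip codes before the loss. All figures in the sketch below are hypothetical:

```python
# Actual total insured value (TIV) by zip code next to the model's
# pre-loss risk ranking. All figures are hypothetical.
tiv_by_zip = {"33801": 410.0, "33805": 95.0, "33810": 60.0}        # $M
model_rank = {"33801": "low", "33805": "low", "33810": "medium"}   # pre-loss

total = sum(tiv_by_zip.values())
for zip_code, tiv in sorted(tiv_by_zip.items(), key=lambda kv: -kv[1]):
    share = tiv / total
    print(f"{zip_code}: ${tiv:>6.0f}M  {share:5.1%} of TIV, "
          f"model-ranked {model_rank[zip_code]}")
```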

Then the forward-looking question: what is the cedent doing differently this year? The reinsurer is forming an opinion about whether the loss was a one-quarter event or a structural underwriting issue that will repeat. The cedent's answer needs to cover both the technical fix to the model and the governance change that makes the technical fix durable. A purely technical response without governance evidence reads as a cedent that will encounter the same surprise again.

A cedent that cannot answer these three questions is exposed. The treaty wordings give the reinsurer rights to non-renew, reprice, or in some cases rescind based on undisclosed changes in underwriting practice or philosophy. We covered the wording exposure in detail in our analysis of why the reinsurance treaty is where AI risk becomes existential, and the same six-word hook applies here: change in underwriting practices or philosophy. A 273% loss quarter is the kind of result that prompts the reinsurer to read the treaty carefully and ask whether the carrier disclosed the model retraining decisions that produced the geographic concentration in the first place.

The Concentration Question Is Older Than AI

The historical context matters because the failure mode is older than the technology. We wrote about Merced Property and Casualty and the Camp Fire as the cleanest pre-AI example: 113 years of underwriting wiped out in 25 days because the catastrophe exposure exceeded the reinsurance coverage and the surplus combined. What destroyed the carrier was a concentration the underwriting framework had not adequately recognized, against an event larger than the program was structured to absorb.

The AI version of that story does not change the underlying mechanism. It changes who is making the concentration decision and how visible the decision is to the people who need to see it. Where a traditional carrier built a dangerous Florida book through underwriting committee decisions, agency appointments, and territorial expansion plans that left a paper trail, an ML rating model that subtly upweights certain zip codes during a retraining cycle can grow the same concentration without producing a single document a treaty underwriter could read.

That is the governance gap the Hippo result exposes. The model produced an outcome the company did not plan for, in territories the company did not deliberately concentrate in, at a velocity the existing underwriting framework was not designed to detect. The 273% number is the visible artifact, but the more consequential one sits underneath it: there is no documented chain showing how the model's territorial weights drifted between renewals and what the company communicated to its reinsurer about the drift.
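
That paper trail is not hard to generate. Below is a sketch of the diff a carrier could produce after each retraining; the weights and the materiality threshold are both assumed for illustration, and anything above the threshold would be the disclosable event we return to at the end:

```python
# Diff the territorial weights of two model versions and flag shifts
# above a materiality threshold as disclosable. All values illustrative.
DISCLOSURE_THRESHOLD = 0.10  # assumed materiality bar

def weight_drift(prior: dict, current: dict, threshold: float) -> list[str]:
    """Return treaty-readable lines for territories whose rating
    weight moved by more than the threshold between retrainings."""
    lines = []
    for territory in sorted(set(prior) | set(current)):
        old = prior.get(territory, 0.0)
        new = current.get(territory, 0.0)
        if abs(new - old) > threshold:
            lines.append(f"{territory}: weight {old:.2f} -> {new:.2f} "
                         f"({new - old:+.2f}) DISCLOSABLE")
    return lines

prior_weights   = {"33801": 0.42, "77479": 0.55}  # hypothetical
current_weights = {"33801": 0.61, "77479": 0.57}
print("\n".join(weight_drift(prior_weights, current_weights,
                             DISCLOSURE_THRESHOLD)))
```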

The CEO Question

Hippo's quarter is useful as a case study because it isolates the climate-blindness mechanism from the questions of distribution model, underwriting talent, or company age. Hippo had the most modern data infrastructure in the industry. The result still arrived. The mechanism is independent of the carrier's sophistication. It depends only on whether the rating model was anchored in a climate that no longer matches the climate it is pricing into.

The question for the CEO of any carrier using ML in homeowners or commercial property rating is short. When did our model last see today's climate, and who validated it?

Most carriers cannot name the validator or the date. The usual answer is some version of "the model is retrained quarterly on rolling claims experience," which answers a different question. Retraining on rolling claims experience captures losses the carrier has already paid, so it is mostly a record of where the climate already was when the loss occurred. It does not incorporate forward-looking climate science, updated peril modeling from the catastrophe modeling vendors, or the structural shift in secondary peril frequency that Aon and the rest of the reinsurance market have been reporting.

The validators who matter for the question are external: the catastrophe modelers, the climate science integration vendors, and increasingly the reinsurer itself, which is forming an opinion about the cedent's model whether the cedent participates in the conversation or not. A carrier that cannot name a recent external validation and cannot point to a documented model change log written for a treaty audience is operating on the same set of assumptions that produced Hippo's quarter. The size of the surprise depends on how much the underlying climate has moved since the model was last calibrated, and on the cedent's geographic concentration when the surprise arrives.
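
The governance fix is simple to represent. A sketch of the validation record that would let a carrier answer the CEO question on the spot; the validator name, field names, and one-year freshness window are all assumptions:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class ValidationRecord:
    """A named external validation of the rating model's climate
    assumptions. Field names are illustrative."""
    validator: str       # e.g. a cat modeling vendor; name assumed
    validated_on: date
    scope: str           # perils and territories covered

def can_answer_ceo_question(records: list[ValidationRecord],
                            today: date, max_age_days: int = 365) -> bool:
    """True if the carrier can name a validator and a recent date."""
    return any(today - r.validated_on <= timedelta(days=max_age_days)
               for r in records)

records = [ValidationRecord("ExternalCatVendor", date(2023, 3, 1),
                            "SCS + wildfire, Gulf and West Coast")]
print(can_answer_ceo_question(records, date(2023, 6, 30)))  # True
```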

The 273% number expressed at one carrier what the underlying mechanism can express at any of them. Boards that want to be ready for the conversation with their reinsurer should ask the validation question now, document the answer in language a treaty underwriter can read, and treat any retraining cycle that materially shifts territorial weights as a disclosable event. The alternative is finding out what the answer was at the same time the reinsurer does, after the loss has already developed.

Footnotes

  1. "Net loss ratio, the ratio of the net losses and loss adjustment expenses to the net earned premium, was 273%, which was 23 points higher compared to Q1 2022."Coverager: Hippo's Q1 2023 results (May 2023)

  2. "I don't know about you but when I have a leaky hose, I generally turn the hose off at the house before I try to plug the leaks. And that's exactly what we did… We're going to start turning the spigot back on in a very selective way in areas in which we think that we are priced adequately."Carrier Management: After Countrywide Pause, InsurTech Hippo Slowly Reemerging Next Week (September 8, 2023)

  3. "For the first time, insured losses from severe convective storms (SCS) in the U.S. surpassed $50 billion and accounted for 60% of global insured losses."Aon: Record $50bn U.S. Severe Convective Storm Losses Drive Total Natural Catastrophe Toll in 2023 (October 19, 2023)

  4. "Last year, the hottest on record, secondary perils accounted for 86% of global insurance losses, according to insurance broker Aon Plc."Insurance Journal: Catastrophe Bonds Use Models Underestimating Climate Risks, Investors Say (May 13, 2024)

  5. "After analyzing data from almost two million US wildfires, Elementum saw a 'statistically significant, higher frequency of areas that were burned in northern California' than the model indicated, Weber said… It was 'benchmarked to historical trends and not to today's climate,' said Jake Weber, Elementum's head of data and analytics."Insurance Journal: Catastrophe Bonds Use Models Underestimating Climate Risks, Investors Say (May 13, 2024)
