Shapley Values Explained for AI Practitioners


Shapley values are an elegant attribution method from cooperative game theory dating back to 1953. Developed by Lloyd Shapley, who later won the Nobel Memorial Prize in Economic Sciences, the approach has become popular for explaining machine learning models. Financial institutions routinely apply Shapley values to lending models, often using them to generate adverse action notices that explain why a loan application was denied.

What Are Shapley Values

Shapley values provide a fair and axiomatically unique method of attributing the gains of a cooperative game to its players.

A cooperative game is one where players collaborate to create value. Think of a company where employees come together to generate profit. The employees are the players and the profit is the gain. The game is specified such that we know the profit for any subset of the players.

Shapley values distribute this profit to participants. Think of it as an algorithm for distributing bonuses. We would want such an algorithm to be fair: the bonus must be commensurate with the employee's contribution.

Four Axioms That Make Shapley Values Unique

Shapley values are uniquely defined by four simple axioms.

Efficiency: The bonuses add up to the total profit. Nothing is lost or created in the distribution.

Dummy: An employee who contributes nothing receives nothing. More precisely, a player whose addition never changes the profit of any coalition gets a zero bonus.

Symmetry: Symmetric employees get equal bonuses. If two employees contribute the same amount to every coalition, they receive identical attribution.

Linearity: If total profit comes from two separate departments, the bonus allocated to each employee should be the sum of bonuses from each department individually.

Shapley values are the only allocation method that satisfies all four properties simultaneously. This mathematical uniqueness gives practitioners confidence in the method's soundness.
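The efficiency axiom is easy to check directly in the two-player case, where the Shapley value has a simple closed form: each player gets their solo profit plus half the synergy. A minimal sketch with hypothetical profit numbers:

```python
# Two-player cooperative game with hypothetical profits:
# A alone earns 10, B alone earns 20, together they earn 40.
v = {frozenset(): 0,
     frozenset({"A"}): 10,
     frozenset({"B"}): 20,
     frozenset({"A", "B"}): 40}

# With two players there are only two orderings (AB and BA);
# averaging each player's marginal contribution over them gives:
phi_A = (v[frozenset({"A"})] + v[frozenset({"A", "B"})] - v[frozenset({"B"})]) / 2
phi_B = (v[frozenset({"B"})] + v[frozenset({"A", "B"})] - v[frozenset({"A"})]) / 2

# Efficiency: the bonuses add up to the total profit.
assert phi_A + phi_B == v[frozenset({"A", "B"})]
```

Here the synergy (40 − 10 − 20 = 10) is split equally, which is exactly what the symmetry axiom demands when neither player is favored by the game.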

The Algorithm

Shapley values are based on the concept of marginal contribution. Consider all possible orderings of employees. For each ordering, introduce employees in that order and note the change in total profit at each step. The average marginal contribution across all orderings is the employee's Shapley value.

Suppose there are three employees and the order is Alice, Bob, Carol. First add Alice and note the increase in profit; this is Alice's marginal contribution under this ordering. Then add Bob with Alice already present and note the increase; this is Bob's marginal contribution. Finally add Carol and note the last increase. Repeat for all six orderings and average each employee's marginal contributions.
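The procedure above can be implemented exactly for small games. The sketch below enumerates every ordering and averages marginal contributions; the profit table is hypothetical, chosen so pairs earn a synergy bonus:

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every possible ordering of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical profit function for Alice, Bob, and Carol.
profits = {
    frozenset(): 0,
    frozenset({"Alice"}): 10, frozenset({"Bob"}): 20, frozenset({"Carol"}): 5,
    frozenset({"Alice", "Bob"}): 40, frozenset({"Alice", "Carol"}): 15,
    frozenset({"Bob", "Carol"}): 25,
    frozenset({"Alice", "Bob", "Carol"}): 50,
}
phi = shapley_values(["Alice", "Bob", "Carol"], profits.get)
```

By the efficiency axiom, the values in `phi` sum to the full team's profit of 50; Bob gets the largest share because his marginal contribution is largest in every ordering.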

Application to Machine Learning

A key question in explaining AI predictions is: why did the model make this prediction? Why did the lending model deny this applicant? Why is this text marked as negative sentiment? Why was this image labeled as a dog?

One way to answer is to quantify the importance of each input feature in the prediction. This is much like the bonus allocation problem. The prediction score is like the profit, and the features are like the employees. Quantifying feature importance is like bonus allocation, which can be carried out using Shapley values.

Challenges in Practice

Applying Shapley values to ML models introduces several challenges.

Defining Feature Absence

To compute Shapley values we need to measure the marginal contribution of a feature. This means knowing the model's prediction when a certain feature is absent. But how do we make a feature absent?

For text input, making a word absent is straightforward: drop it from the input. But what about tabular data? How do we make the income feature absent?

Different methods tackle this differently. Some replace the feature with a baseline value such as the mean or median of the training distribution. Others model absence by sampling replacement values from a reference distribution and averaging the resulting predictions; the SHAP library takes this sampling approach.

The choice of distribution is an important design decision that affects the attributions computed. Different distributions lead to different results.
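A minimal sketch of the expectation-over-a-distribution approach, using a hypothetical linear lending model (`model`, `background`, and the feature names are all illustrative, not any particular library's API):

```python
import numpy as np

def predict_with_absent(model, x, absent_idx, background):
    """Approximate the prediction with some features 'absent' by
    filling them in from background samples and averaging the
    predictions. Using a single baseline row instead of many
    samples recovers the mean/median-replacement variant."""
    filled = np.repeat(x[None, :], len(background), axis=0)
    filled[:, absent_idx] = background[:, absent_idx]
    return model(filled).mean()

# Hypothetical model over features [income, debt, age].
model = lambda X: X @ np.array([0.5, -0.3, 0.1])
rng = np.random.default_rng(0)
background = rng.normal(size=(100, 3))   # stand-in for training data
x = np.array([2.0, 1.0, 0.5])            # applicant to explain

# Score with the income feature treated as absent.
score_without_income = predict_with_absent(model, x, [0], background)
```

Swapping `background` for a different reference distribution changes `score_without_income`, and therefore every marginal contribution computed from it, which is why the choice of distribution matters so much in practice.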

Computational Cost

Going through all possible orderings is computationally expensive: with n features there are n! orderings. With some algebra this can be reduced to 2^n model invocations, one per coalition, which is still exponential in the number of features.

Most approaches use sampling to make computation tractable. Sampling introduces uncertainty, so it becomes important to quantify uncertainty via confidence intervals over attributions.
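A sketch of this sampling strategy: draw random orderings, record the target player's marginal contribution in each, and report the mean together with a normal-approximation 95% confidence interval (the function name and toy game below are illustrative):

```python
import numpy as np

def sample_shapley(value, players, target, n_samples=2000, seed=0):
    """Monte Carlo estimate of one player's Shapley value with a
    95% confidence interval from the standard error of the mean."""
    rng = np.random.default_rng(seed)
    contribs = np.empty(n_samples)
    for i in range(n_samples):
        order = list(rng.permutation(players))
        before = frozenset(order[:order.index(target)])
        contribs[i] = value(before | {target}) - value(before)
    mean = contribs.mean()
    half = 1.96 * contribs.std(ddof=1) / np.sqrt(n_samples)
    return mean, (mean - half, mean + half)

# Toy game: a coalition's value is its size, so every player's
# marginal contribution is always 1 and the interval collapses.
players = ["a", "b", "c", "d"]
est, ci = sample_shapley(lambda s: len(s), players, "a")
```

In real models the contributions vary across orderings, so the interval has nonzero width; reporting it alongside the point estimate makes the sampling uncertainty explicit.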

Black-Box Compatibility

Computing Shapley values only requires input-output access to the model. We probe the model on counterfactual inputs. In this sense, it is a black-box explanation method. The model function need not be smooth, differentiable, or continuous.

This flexibility makes Shapley values applicable across model types. The same approach works for linear models, tree ensembles, and neural networks.

Implications for AI Governance

Shapley values have become central to responsible AI practices in regulated industries. Lending regulations require that credit denials be accompanied by reasons the applicant can understand. Shapley-based explanations provide a principled way to identify which factors drove a decision.

However, practitioners should understand the method's limitations. The choice of baseline distributions affects results. Sampling introduces uncertainty. And the explanations describe model behavior, not necessarily real-world causation.

Used thoughtfully, Shapley values provide a powerful tool for understanding AI decisions. Used carelessly, they can provide misleading explanations that give false confidence in model behavior.

The organizations that implement AI governance effectively will be those that understand both the power and the limitations of their explainability methods.
