Explaining machine learning models

Lecture 24

Dr. Benjamin Soltoff

Cornell University
INFO 3312/5312 - Spring 2024

April 25, 2024



  • Project 02 peer review



Review: Explanation


Answer to the “why” question

Good explanations are

  • Contrastive: Why was this prediction made instead of another prediction?
  • Selected: Focuses on just a handful of reasons, even if the problem is more complex
  • Social: Needs to be understandable by your audience
  • Truthful: Explanation should predict the event as truthfully as possible
  • Generalizable: Explanation could apply to many predictions
  • Explanation \(\leadsto\) local methods

Evaluating test set performance

Local explanations

Cornell University

Rows: 1
Columns: 12
$ state     <chr> "NY"
$ type      <fct> "Private, nonprofit"
$ admrate   <dbl> 0.0869
$ satavg    <dbl> 1510
$ cost      <dbl> 77047
$ netcost   <dbl> 29011
$ avgfacsal <dbl> 141849
$ pctpell   <dbl> 0.1737
$ comprate  <dbl> 0.9414
$ firstgen  <dbl> 0.154164
$ debt      <dbl> 13000
$ locale    <fct> City

Ithaca College

Rows: 1
Columns: 12
$ state     <chr> "NY"
$ type      <fct> "Private, nonprofit"
$ admrate   <dbl> 0.7773
$ satavg    <dbl> NA
$ cost      <dbl> 65274
$ netcost   <dbl> 33748
$ avgfacsal <dbl> 81369
$ pctpell   <dbl> 0.2029
$ comprate  <dbl> 0.7717
$ firstgen  <dbl> 0.1375752
$ debt      <dbl> 19500
$ locale    <fct> Suburb

Break it down

Breakdown methods

  • Show how contributions attributed to individual features shift the model’s mean prediction toward the prediction for a particular observation
  • Sequentially fix the value of individual features and examine the change in the prediction
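The two bullets above can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the implementation behind the plots in these slides; `break_down`, `toy_model`, and the simulated `background` data are hypothetical names and values introduced only for this example.

```python
import numpy as np

def break_down(predict, background, x, order):
    """Sequentially fix features to x's values and record the shift in the mean prediction."""
    data = background.copy()
    baseline = predict(data).mean()      # the model's mean prediction
    previous = baseline
    contributions = {}
    for j in order:
        data[:, j] = x[j]                # fix feature j at the observation's value
        current = predict(data).mean()   # mean prediction after fixing feature j
        contributions[j] = current - previous
        previous = current
    return baseline, contributions

# Hypothetical stand-in for a fitted model (assumption: any function of a 2-D array works)
def toy_model(X):
    return 2.0 * X[:, 0] + X[:, 1] - 0.5 * X[:, 2]

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))   # simulated reference data
x = np.array([1.0, -2.0, 0.5])           # the observation to explain

baseline, contributions = break_down(toy_model, background, x, order=[0, 1, 2])
prediction = toy_model(x[None, :])[0]
print(baseline, contributions, prediction)
```

Once every feature has been fixed, all rows equal the observation, so the baseline plus the contributions adds up to the model's prediction for that observation.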

Breakdown of Cornell University using the random forest model

Break it down again

Breakdown of random forest

Breakdown plots


  • Easy to understand
  • Compact visualization
  • Intuitive explanation for linear models


  • Ignores interactive contributions (assumes everything is additive)
  • Ordering of the explanatory variables influences the breakdown and resulting explanation
  • Harder to interpret for models with lots of predictors
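A small worked example of the ordering problem, assuming a hypothetical model with a pure interaction between two features. The function names and the four-row background data are made up for illustration.

```python
import numpy as np

def break_down(predict, background, x, order):
    """Contribution of each feature when features are fixed in the given order."""
    data = background.copy()
    previous = predict(data).mean()
    contributions = {}
    for j in order:
        data[:, j] = x[j]
        current = predict(data).mean()
        contributions[j] = current - previous
        previous = current
    return contributions

# Hypothetical model whose output is purely an interaction of two features
def interaction_model(X):
    return X[:, 0] * X[:, 1]

background = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x = np.array([3.0, 3.0])

first = break_down(interaction_model, background, x, order=[0, 1])
second = break_down(interaction_model, background, x, order=[1, 0])
print(first)   # feature 0 gets 2.0, feature 1 gets 6.0
print(second)  # feature 1 gets 2.0, feature 0 gets 6.0
```

Both orderings account for the same total shift (from a mean prediction of 1 to a prediction of 9), but they split it differently between the two features.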

Shapley Additive Explanations (SHAP)


  • Average contributions of features are computed across different feature orderings (coalitions)
  • Randomly permute the feature order \(B\) times
  • Average across individual breakdowns to calculate feature contribution to individual prediction
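The averaging described above can be sketched as follows. This is a rough Python illustration under stated assumptions, not the routine used to produce the SHAP plots in these slides; `sampled_shap`, `exact_shap`, and the toy model and data are hypothetical.

```python
import itertools
import numpy as np

def break_down(predict, background, x, order):
    """Per-feature contributions for one ordering (sequentially fix features at x)."""
    data = background.copy()
    previous = predict(data).mean()
    contributions = np.zeros(len(x))
    for j in order:
        data[:, j] = x[j]
        current = predict(data).mean()
        contributions[j] = current - previous
        previous = current
    return contributions

def sampled_shap(predict, background, x, B=50, seed=0):
    """Average break-down contributions over B random feature orderings."""
    rng = np.random.default_rng(seed)
    total = np.zeros(len(x))
    for _ in range(B):
        total += break_down(predict, background, x, rng.permutation(len(x)))
    return total / B

def exact_shap(predict, background, x):
    """Average over every ordering; only feasible for a handful of features."""
    orders = list(itertools.permutations(range(len(x))))
    return sum(break_down(predict, background, x, o) for o in orders) / len(orders)

# Hypothetical model and data: a pure interaction between two features
def interaction_model(X):
    return X[:, 0] * X[:, 1]

background = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x = np.array([3.0, 3.0])

approx = sampled_shap(interaction_model, background, x, B=50)
exact = exact_shap(interaction_model, background, x)
print(approx, exact)
```

In this toy case the exact average splits the interaction evenly between the two features (4 each), while the sampled version approaches that split as \(B\) grows.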

SHAP for Cornell

Cornell University vs. Ithaca College


Shapley values


  • Model-agnostic
  • Strong formal foundation from game theory
  • Considers all (or many) possible feature orderings


  • Ignores interactive contributions (assumes everything is additive)
  • A large number of predictors makes it infeasible to consider all possible coalitions
  • Computationally expensive
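For reference, the game-theoretic definition behind these points can be written out. The notation here is an assumption rather than something taken from the slides: \(p\) is the number of predictors, \(f\) the model, \(x\) the observation being explained, and \(v(S)\) the expected prediction when the features in \(S\) are fixed at their values in \(x\).

\[
\phi_j(x) = \sum_{S \subseteq \{1,\dots,p\} \setminus \{j\}} \frac{|S|!\,(p-|S|-1)!}{p!} \left[ v(S \cup \{j\}) - v(S) \right],
\qquad
v(S) = \mathbb{E}\left[ f(X) \mid X_S = x_S \right]
\]

The sum runs over \(2^{p-1}\) subsets for each feature, so with \(p = 20\) predictors that is already \(2^{19} = 524{,}288\) subsets per feature. This is why the contributions are usually approximated from a sample of orderings. The contributions also satisfy \(\sum_{j=1}^{p} \phi_j(x) = f(x) - \mathbb{E}[f(X)]\), which is the sense in which the explanation is additive.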

Local interpretable model-agnostic explanations (LIME)



  • Global \(\rightarrow\) local
  • Interpretable model used to explain individual predictions of a black box model
  • Assumes every complex model is linear on a local scale
  • Simple model explains the predictions of the complex model locally
    • Local fidelity
    • Does not require global fidelity
  • Works on tabular, text, and image data


LIME procedure

  1. For each prediction to explain, permute the observation \(n\) times
  2. Let the complex model predict the outcome of all permuted observations
  3. Calculate the distance from all permutations to the original observation
  4. Convert the distance to a similarity score
  5. Select \(m\) features best describing the complex model outcome from the permuted data
  6. Fit a simple model to the permuted data, explaining the complex model outcome with the \(m\) selected features and weighting each permuted observation by its similarity to the original observation
  7. Extract the feature weights from the simple model and use these as explanations for the complex model’s local behavior
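The seven steps above can be sketched for tabular, numeric data. This is a minimal Python illustration, not the implementation of any particular LIME package. Several choices are assumptions made for the example: Gaussian noise for the permutations, Euclidean distance, an exponential kernel for the similarity score, and picking the \(m\) features with the largest coefficients in a full weighted fit. The names `lime_explain`, `weighted_linear_fit`, and `black_box` are hypothetical.

```python
import numpy as np

def weighted_linear_fit(Z, y, w):
    """Weighted least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(Z)), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

def lime_explain(predict, x, n=500, m=2, scale=1.0, kernel_width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Permute the observation n times (assumption: add Gaussian noise)
    Z = x + rng.normal(scale=scale, size=(n, len(x)))
    # 2. Let the complex model predict the outcome of the permuted observations
    y = predict(Z)
    # 3. Distance from each permutation to the original observation
    dist = np.linalg.norm(Z - x, axis=1)
    # 4. Convert distance to a similarity score (assumption: exponential kernel)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 5. Select m features (assumption: largest absolute weights in a full fit)
    full = weighted_linear_fit(Z, y, w)
    selected = np.sort(np.argsort(-np.abs(full[1:]))[:m])
    # 6. Fit the simple model on the selected features, weighted by similarity
    coef = weighted_linear_fit(Z[:, selected], y, w)
    # 7. The feature weights are the local explanation
    return {int(j): float(c) for j, c in zip(selected, coef[1:])}

# Hypothetical black box: nonlinear in feature 0, linear in feature 1, ignores feature 2
def black_box(X):
    return X[:, 0] ** 2 + 3.0 * X[:, 1]

explanation = lime_explain(black_box, np.array([1.0, 0.0, 0.5]))
print(explanation)
```

Changing `scale` or `kernel_width` changes what counts as "local", and the resulting weights can shift noticeably, so these are settings to examine rather than leave at arbitrary values.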

\(10\) nearest neighbors

Random forest

Binning continuous variables



  • Can choose different local surrogates (e.g. regression model, decision tree)
  • Local surrogates have their own set of interpretable features
  • Explanations tend to be short and (maybe) contrastive
  • Works for tabular data, text, and images


  • Hard to define the local “neighborhood”
  • Explanations tend to be unstable

Application exercise


  • Go to the course GitHub org and find your ae-21 (repo name will be suffixed with your GitHub name).
  • Clone the repo in RStudio Workbench, open the Quarto document in the repo, and follow along and complete the exercises.




Additional resources