Trucks Don’t Mean Trump: Diagnosing Human Error in Image Analysis

  • J.D. Zamfirescu-Pereira
  • Jerry Chen
  • Emily Wen
  • Allison Koenecke
  • Nikhil Garg
  • Emma Pierson

FAccT 2022

Algorithms provide powerful tools for detecting and dissecting human bias and error. Here, we develop machine learning methods to analyze how humans err in a particular high-stakes task: image interpretation. We leverage a unique dataset of 16,135,392 human predictions of whether a neighborhood voted for Donald Trump or Joe Biden in the 2020 US election, based on a Google Street View image. We show that by training a machine learning estimator of the Bayes optimal decision for each image, we can provide an actionable decomposition of human error into bias, variance, and noise terms, and further identify specific features (like pickup trucks) which lead humans astray. Our methods can be applied to ensure that human-in-the-loop decision-making is accurate and fair, and they also apply to black-box algorithmic systems.
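To make the idea of decomposing human error concrete, here is a minimal sketch of the textbook bias, variance, and noise decomposition for binary guesses. It is not necessarily the decomposition used in the paper. It assumes that each image has several independent human guesses coded as 0 or 1, that a model supplies an estimate `p_hat` of the probability that the true label is 1 for that image, and that this estimate is treated as the truth. The function name `decompose_error` and the toy data are hypothetical.

```python
from statistics import fmean


def decompose_error(votes, p_hat):
    """Split expected human 0-1 error into noise, squared bias, and variance.

    votes: one list of 0/1 human guesses per image (hypothetical input format).
    p_hat: per-image model estimate of P(label = 1 | image), assumed accurate.
    """
    noise, bias_sq, variance = [], [], []
    for image_votes, p in zip(votes, p_hat):
        m = fmean(image_votes)           # share of humans guessing 1
        noise.append(p * (1 - p))        # irreducible label uncertainty
        bias_sq.append((m - p) ** 2)     # systematic gap between humans and model
        variance.append(m * (1 - m))     # disagreement among humans
    terms = {
        "noise": fmean(noise),
        "bias_sq": fmean(bias_sq),
        "variance": fmean(variance),
    }
    # For binary guesses and labels, the three terms add up to the
    # probability that a random guess disagrees with the label.
    terms["expected_error"] = sum(terms.values())
    return terms


# Toy data: three images with four guesses each (made up for illustration).
votes = [[1, 1, 0, 1], [0, 0, 0, 1], [1, 0, 1, 0]]
p_hat = [0.9, 0.2, 0.5]
print(decompose_error(votes, p_hat))
```

The three terms sum exactly to the expected error because, for binary values, m(1 - p) + (1 - m)p = m(1 - m) + (m - p)^2 + p(1 - p). In this sketch every image is weighted equally, which is a simplifying assumption.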