Shared Interest: Human Annotations vs. AI Saliency

Angie Boggust; Benjamin Hoover; Arvind Satyanarayan; Hendrik Strobelt

NeurIPS 2020

Demo paper

06 Dec 2020

Shared Interest: Human Annotations vs. AI Saliency

View publication

Abstract

As deep learning is applied to high stakes scenarios, it is increasingly important that a model is not only making accurate decisions, but doing so for the right reasons. Common explainability methods provide pixel attributions as an explanation for a model's decision on a single image; however, using input-level explanations to understand patterns in model behavior is challenging for large datasets as it requires selecting and analyzing an interesting subset of inputs. Utilizing human generated ground truth object locations, we introduce metrics for ranking inputs based on the correspondence between the input’s ground truth location and the explainability method’s explanation region. Our methodology is agnostic to model architecture, explanation method, and dataset allowing it to be applied to many tasks. We demo our method on two high profile scenarios: a widely used image classification model and a melanoma prediction model, showing it surfaces patterns in model behavior by aligning model explanations with human annotations.

Paper