Reliable Counterfactual Explanations for Autoencoder based Anomalies

Swastik Haldar; Philips George John; Diptikalyan Saha

doi:10.1145/3430984.3431015

CODS-COMAD 2021

Conference paper

02 Jan 2020



Abstract

Autoencoders have been used successfully for anomaly detection in an unsupervised setting, and often give better results than traditional approaches such as clustering and subspace-based (linear) methods. An autoencoder flags a data point as anomalous if its reconstruction loss exceeds an appropriate threshold. However, as with other deep learning models, the increased accuracy offered by autoencoders comes at the cost of interpretability. Explaining an autoencoder's decision to flag a particular data point as an anomaly is important, since a domain expert tasked with evaluating the model's decisions needs a human-friendly explanation. We consider the problem of finding counterfactual explanations for autoencoder anomalies, which address the question of what needs to be minimally changed in a given anomalous data point to make it non-anomalous. We present an algorithm that generates a diverse set of proximate counterfactual explanations for a given autoencoder anomaly. We also introduce the notion of reliability of a counterfactual, and present techniques to find reliable counterfactual explanations.
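As a rough illustration of the two ideas in the abstract (flagging by reconstruction loss, and asking what minimal change makes a point non-anomalous), the sketch below uses a rank-k PCA projection as a stand-in for a trained autoencoder. This is not the paper's algorithm: it searches only along the straight line from the anomalous point to its own reconstruction, and it does not address diversity or reliability. All function names, the synthetic data, and the 99th-percentile threshold are assumptions made for the example.

```python
import numpy as np

def fit_linear_autoencoder(X, k):
    """Rank-k PCA used as a stand-in autoencoder (assumption, not the paper's model)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruct(x, mean, comps):
    # Encode (project onto the components), then decode back to input space.
    return mean + (x - mean) @ comps.T @ comps

def recon_loss(x, mean, comps):
    # Squared-error reconstruction loss.
    return float(np.sum((x - reconstruct(x, mean, comps)) ** 2))

def is_anomaly(x, mean, comps, threshold):
    # A point is flagged when its reconstruction loss exceeds the threshold.
    return recon_loss(x, mean, comps) > threshold

def naive_counterfactual(x, mean, comps, threshold, steps=50):
    """Bisection for the smallest step t in [0, 1] along x -> reconstruct(x)
    at which the point is no longer flagged. Hypothetical helper."""
    target = reconstruct(x, mean, comps)
    lo, hi = 0.0, 1.0  # hi = 1 is the reconstruction itself, whose loss is ~0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if is_anomaly(x + mid * (target - x), mean, comps, threshold):
            lo = mid
        else:
            hi = mid
    return x + hi * (target - x)

# Synthetic "normal" data: points near a line in 3-D, plus small noise.
rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
X_train = rng.normal(size=(500, 1)) * direction + 0.05 * rng.normal(size=(500, 3))

mean, comps = fit_linear_autoencoder(X_train, k=1)
train_losses = [recon_loss(x, mean, comps) for x in X_train]
threshold = float(np.percentile(train_losses, 99))  # assumed thresholding rule

x_anom = np.array([3.0, -3.0, 0.0])  # far from the line, so poorly reconstructed
cf = naive_counterfactual(x_anom, mean, comps, threshold)

print("anomalous before:", is_anomaly(x_anom, mean, comps, threshold))
print("anomalous after: ", is_anomaly(cf, mean, comps, threshold))
print("change applied:  ", cf - x_anom)
```

For this linear stand-in the loss shrinks monotonically along the search line, so bisection finds the smallest step in that one direction. A nonlinear autoencoder gives no such guarantee, and a counterfactual that is proximate in a more meaningful sense (for example, changing few features) would need a different search.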
