Publication
NeurIPS 2023
Workshop paper
Cost-Aware Counterfactuals for Black Box Explanations
Abstract
Counterfactual explanations provide actionable insights into the minimal change in a system that would lead to a more desirable prediction from a black box model. We address the challenge of finding valid, low-cost counterfactuals in the setting where each feature has a different cost or preference for being perturbed. We propose a multiplicative weight approach that is applied to the perturbation, and show that this simple approach can be easily adapted to obtain multiple diverse counterfactuals, as well as to integrate the feature importances obtained from other state-of-the-art explainers when generating counterfactual examples. Additionally, we discuss the computation of valid counterfactuals with numerical gradient-based methods when the black box model presents flat regions with no reliable gradient. In this scenario, sampling approaches, as well as those that rely on available data, sometimes yield counterfactuals that are not close to the decision boundary. We show that a simple long-range guidance approach, which consists of sampling from a sphere of larger radius in search of a direction of change for the black box predictor when no gradient is available, improves the quality of the counterfactual explanation. In this work we discuss existing approaches and show how our proposed alternatives compare favorably on different datasets and metrics.
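The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the code assumes the multiplicative weights act as an elementwise multiplier on each update step (smaller weight for a costlier feature) and that long-range guidance picks, among random directions on a sphere of radius `radius`, the one that raises the score the most. All function names, parameters, and the toy predictor are hypothetical.

```python
import numpy as np

def numerical_gradient(predict, x, eps=1e-4):
    """Central finite-difference gradient of a scalar black-box score."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (predict(x + step) - predict(x - step)) / (2 * eps)
    return grad

def long_range_direction(predict, x, radius, n_samples, rng):
    """Sample points on a sphere of the given radius around x and return the
    unit direction with the largest score gain (None if no sample improves)."""
    dirs = rng.normal(size=(n_samples, x.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    base = predict(x)
    gains = np.array([predict(x + radius * d) - base for d in dirs])
    best = int(np.argmax(gains))
    return dirs[best] if gains[best] > 0 else None

def cost_aware_counterfactual(predict, x0, weights, target=0.5, lr=0.1,
                              radius=1.0, n_samples=64, max_iter=500,
                              grad_tol=1e-8, seed=0):
    """Assumed scheme: weighted ascent on the score until it reaches target."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    w = np.asarray(weights, dtype=float)
    for _ in range(max_iter):
        if predict(x) >= target:
            return x
        g = numerical_gradient(predict, x)
        norm = np.linalg.norm(g)
        if norm < grad_tol:
            # Flat region: fall back to long-range guidance.
            d = long_range_direction(predict, x, radius, n_samples, rng)
            if d is None:
                radius *= 2.0  # nothing found, look further away
                continue
            g = d
        else:
            g = g / norm
        # Multiplicative weights scale the perturbation per feature.
        x = x + lr * w * g
    return None

# Hypothetical black box: score is flat (zero) wherever x[0] + x[1] <= 1.
def toy_predict(x):
    return float(np.clip(x[0] + x[1] - 1.0, 0.0, 1.0))

x0 = np.array([0.0, 0.0])
weights = np.array([1.0, 0.1])  # assumption: feature 1 is costly to change
cf = cost_aware_counterfactual(toy_predict, x0, weights)
```

In this toy setup the starting point lies in a flat region, so the first steps rely on the sampled direction, and the small weight on the second feature keeps most of the change in the first one.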