Data pre-processing for discrimination prevention: Information-theoretic optimization and analysis

Flavio Du Pin Calmon; Dennis Wei; Bhanukiran Vinzamuri; Karthikeyan Natesan Ramamurthy; Kush R. Varshney

doi:10.1109/JSTSP.2018.2865887

IEEE JSTSP

Paper

01 Oct 2018



Abstract

Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling group discrimination, limiting distortion in individual data samples, and preserving utility. Several theoretical properties are established, including conditions for convexity, a characterization of the impact of limited sample size on discrimination and utility guarantees, and a connection between discrimination and estimation. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy and with precise control of individual distortion.
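The trade-off between group discrimination and individual distortion described in the abstract can be illustrated with a minimal sketch. It is not the paper's optimization: the gap in positive-outcome rates between groups and the count of changed records are stand-in measures chosen here for illustration, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def outcome_rates(records):
    """Estimate P(y = 1 | d) for each group d from (d, y) pairs, y in {0, 1}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for d, y in records:
        counts[d][0] += y
        counts[d][1] += 1
    return {d: pos / total for d, (pos, total) in counts.items()}


def parity_gap(records):
    """Largest difference in positive-outcome rate between any two groups.

    Assumption: a simple stand-in for a group discrimination measure.
    """
    rates = outcome_rates(records)
    return max(rates.values()) - min(rates.values())


def changed_records(before, after):
    """Number of records whose (d, y) pair differs: a crude distortion count."""
    return sum(b != a for b, a in zip(before, after))


# Hypothetical toy data: (protected attribute d, binary outcome y).
original = [("a", 1), ("a", 1), ("a", 1), ("a", 0),
            ("b", 1), ("b", 0), ("b", 0), ("b", 0)]
# A hand-made "transformed" dataset that flips one outcome in each group.
transformed = [("a", 1), ("a", 1), ("a", 0), ("a", 0),
               ("b", 1), ("b", 1), ("b", 0), ("b", 0)]

gap_before = parity_gap(original)
gap_after = parity_gap(transformed)
n_changed = changed_records(original, transformed)
print(gap_before, gap_after, n_changed)
```

In this toy case the gap between groups is removed by changing two of the eight records. The paper instead learns a randomized transformation by convex optimization, with explicit constraints on discrimination and per-sample distortion.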
