Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling group discrimination, limiting distortion in individual data samples, and preserving utility. Several theoretical properties are established, including conditions for convexity, a characterization of the impact of limited sample size on discrimination and utility guarantees, and a connection between discrimination and estimation. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy and with precise control of individual distortion.