MonoNet: Towards Interpretable Models by Learning Monotonic Features

An-Phi Nguyen; Maria Rodriguez Martinez

NeurIPS 2019

Workshop paper

13 Dec 2019

Abstract

Being able to interpret, or explain, the predictions made by a machine learning model is of fundamental importance. This is especially true when there is interest in deploying data-driven models to make high-stakes decisions, e.g. in healthcare. In this paper, we claim that the difficulty of interpreting a complex model stems from the interactions among its features. By enforcing monotonicity between features and outputs, we are able to reason about the effect of a single feature on an output independently of the other features, and consequently better understand the model. We show how to introduce this constraint structurally in deep learning models by adding new simple layers. We validate our model on benchmark datasets and compare our results with previously proposed interpretable models.
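To make the idea of a structural monotonicity constraint concrete, the following is a minimal sketch, not the layer definition used in the paper. It assumes one common construction: each effective weight is the square of a free parameter (so it is never negative) and the activation is increasing (tanh). Under these assumptions every output is non-decreasing in every input. The names `monotonic_layer`, `monotonic_block`, and `random_layers` are hypothetical and chosen only for illustration.

```python
import math
import random


def monotonic_layer(x, raw_weights, biases):
    """Dense layer whose outputs are non-decreasing in every input.

    Assumption for this sketch: effective weights are the squares of the
    raw parameters, so they are never negative; tanh is increasing.
    """
    outputs = []
    for row, b in zip(raw_weights, biases):
        pre = sum((w * w) * xi for w, xi in zip(row, x)) + b
        outputs.append(math.tanh(pre))
    return outputs


def monotonic_block(x, layers):
    """Stack of monotonic layers; a composition of non-decreasing maps."""
    for raw_weights, biases in layers:
        x = monotonic_layer(x, raw_weights, biases)
    return x


def random_layers(sizes, seed=0):
    """Random raw parameters (any sign) for consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((w, b))
    return layers


layers = random_layers([3, 4, 2])
low = monotonic_block([0.1, -0.5, 0.3], layers)
high = monotonic_block([0.9, -0.5, 0.3], layers)  # only the first feature is raised
is_monotone = all(h >= l for l, h in zip(low, high))
```

Because the raw parameters may take any sign while the effective weights cannot, the constraint holds by construction rather than through a penalty term, which is what allows the effect of raising a single input to be read off without considering the values of the others.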
