Publication
Nature Methods
Paper

The class imbalance problem

Abstract

We previously discussed how classifiers based on logistic regression and decision trees can be used for predicting the class of an observation. Unfortunately, when such classifiers are trained on a dataset in which one of the response classes is rare, they can underestimate the probability of observing a rare event — the greater the imbalance, the greater this small-sample bias. This month, we illustrate how to mitigate the negative effect of class imbalance on the training of classifiers.
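The abstract does not say which mitigation strategies the paper covers. As a minimal sketch, two commonly used ones are assumed here purely for illustration: inverse-frequency class weights and random oversampling of the rare class. The function names `class_weights` and `random_oversample` and the toy 95:5 dataset are hypothetical and do not come from the publication.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Resample smaller classes with replacement up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        # Pool of observations belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - cnt)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (assumed): 95 common-class and 5 rare-class observations.
labels = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(100)]

weights = class_weights(labels)
bal_rows, bal_labels = random_oversample(rows, labels)

print(weights)              # the rare class receives the larger weight
print(Counter(bal_labels))  # both classes now have equal counts
```

Either the weights or the resampled data would then be passed to the classifier during training. Weighting leaves the dataset unchanged, whereas oversampling duplicates rare observations, so which is preferable depends on the classifier and on how the predicted probabilities will be used.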
