A logistic regression model for small sample classification problems with hidden variables and non-linear relationships: An application in business analytics
Abstract
Logistic regression is one of the most frequently used models in pattern recognition, especially in binary classification tasks. We focus on a class of small-sample classification problems where logistic regression seems to be a "natural" choice for the classifier, yet its direct application yields sub-optimal results. Specifically, we consider cases where: 1) input-output relationships are non-linear, 2) hidden states or auxiliary variables in the model must be estimated, and 3) the training set is small, preventing the use of more sophisticated techniques. We first describe an approach to computing the regression parameters that addresses the issue of estimating hidden variables. We then describe a recursive adaptation procedure that identifies the most significant non-linear relationships in the data and adapts the model by introducing the corresponding higher-order terms. The performance of the method is tested in a business modeling application, demonstrating significant improvements over traditional classifiers. © 2005 IEEE.
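The idea of adapting a logistic regression by introducing higher-order terms can be illustrated with a minimal sketch. The abstract does not specify the selection criterion or the fitting routine, so everything below is an assumption made for illustration: a greedy forward search over pairwise products of the inputs stands in for the recursive adaptation procedure, plain gradient descent on an L2-penalised log-loss stands in for the parameter estimation, and the names `fit_logistic`, `expand`, `adapt_model`, `max_terms`, and `min_gain` are hypothetical. Hidden-variable estimation is not covered here.

```python
import math
import random
from itertools import combinations_with_replacement


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, steps=300, lr=0.5, l2=1e-3):
    # Batch gradient descent on the L2-penalised mean log-loss (assumed fitting routine).
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        for j in range(d + 1):
            penalty = l2 * w[j] if j else 0.0  # the bias is not penalised
            w[j] -= lr * (grad[j] / n + penalty)
    return w


def log_loss(X, y, w):
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def expand(X, terms):
    # Append one product column x[i] * x[j] for every selected term (i, j).
    return [list(x) + [x[i] * x[j] for i, j in terms] for x in X]


def adapt_model(X, y, max_terms=2, min_gain=0.01):
    # Greedy forward selection: add the second-order term that lowers the
    # training log-loss the most, and stop when the gain is below min_gain.
    d = len(X[0])
    candidates = list(combinations_with_replacement(range(d), 2))
    terms = []
    w = fit_logistic(X, y)
    best_loss = log_loss(X, y, w)
    while len(terms) < max_terms:
        trials = []
        for c in candidates:
            if c in terms:
                continue
            Xc = expand(X, terms + [c])
            wc = fit_logistic(Xc, y)
            trials.append((log_loss(Xc, y, wc), c, wc))
        if not trials:
            break
        loss, c, wc = min(trials, key=lambda t: t[0])
        if best_loss - loss < min_gain:
            break
        terms.append(c)
        w, best_loss = wc, loss
    return terms, w, best_loss


# Synthetic small-sample data (not from the paper): the label depends only
# on the sign of the product of the two inputs, which a linear model misses.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
y = [1 if a * b > 0 else 0 for a, b in X]

terms, weights, final_loss = adapt_model(X, y)
print("selected terms:", terms)
print("final training log-loss:", round(final_loss, 3))
```

The sketch scores candidates on the training loss for brevity; with samples this small, a cross-validated score would be a more cautious choice for deciding whether a term is worth adding.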