About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
INFORMS 2023
Invited talk
Scalable Optimal Multiway-Split Decision Trees
Abstract
There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) as heuristic-based methods do not guarantee optimality and find it challenging to incorporate practical operational constraints. Existing MIP methods rely on an arc-based binary-tree formulations that are limited to datasets having a few thousand samples, sample-level constraints, and linear metrics. We propose a novel path-based MIP model that can be solved using a scalable column generation framework to produce a multiway-split tree that is more interpretable due to its shorter rules. Our method can optimize nonlinear metrics like F1 score and satisfy a broader class of constraints. We demonstrate its efficacy with extensive experiments including a million-sample dataset and report up to a 24X reduction in runtime compared to the state-of-art MIP-based methods.