Publication
INFORMS 2023
Invited talk

Scalable Optimal Multiway-Split Decision Trees

View publication

Abstract

There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) as heuristic-based methods do not guarantee optimality and find it challenging to incorporate practical operational constraints. Existing MIP methods rely on an arc-based binary-tree formulations that are limited to datasets having a few thousand samples, sample-level constraints, and linear metrics. We propose a novel path-based MIP model that can be solved using a scalable column generation framework to produce a multiway-split tree that is more interpretable due to its shorter rules. Our method can optimize nonlinear metrics like F1 score and satisfy a broader class of constraints. We demonstrate its efficacy with extensive experiments including a million-sample dataset and report up to a 24X reduction in runtime compared to the state-of-art MIP-based methods.

Date

15 Oct 2023

Publication

INFORMS 2023