DOFramework: A testing framework for decision optimization model learners

Mathematical decision-optimization (DO) models provide business decision support for a wide range of scenarios. Take for example the winners and finalists of INFORM’s Franz Edelman Award from recent years. Those include the U.S. Census Bureau (Enumerator Assignments, $2.5B), the U.S. Federal Communications Commission (Wireless Service Auction, $20B), Vattenfall BA Wind (Wind Power Output, $170M), and Deutsche Bahn (Train Rotation Planning) among others.

In a mathematical DO model, often, hard-to-model terms can be learned from data. Learning, however, may result in a DO model that fails to capture the actual system that produced the data, leading to poor recommendations. For this purpose, we introduce DOFramework¹ as a playground for researchers and data scientists to test, tune, and compare novel approaches for integrating learning into DO models.

DOFramework generates multiple instances of optimization problems, feeds them to the user’s DO model learner, and collects its predicted optima. By comparing the predicted optimal values against the ground truth, DOFramework delivers a comprehensive prediction profile of the user’s DO model learner. Similarly to Gymnasium², which offers environments to develop and compare reinforcement learning (RL) algorithms, DOFramework addresses the need for benchmark playgrounds. DOFramework, however, samples problems at random rather than rely on a fixed library of those.

Learned DO models

A DO model is a mathematical formulation that incorporates an objective to optimize under constraints

x^*∈ \mathrm{arg}⁡ \min\limits_{x∈\mathbb{R}^d} f(x)

s.t. \ \ g(x)≤0

The objective in this formulation is $f$ , while the constraints are given by the inequality $g(x) ≤ 0$ . We optimize over all realizations of the decision variable. The goal is to solve the optimization problem by finding an optimum $x^*$ .

A DO model learner is an algorithm, routine, or pipeline that incorporates learning and knowledge into a DO model that integrates both. Its outcome is a learned DO model.

Integrating machine learning (ML) models into DO models offers tremendous opportunities whenever objective ( $f$ ) or constraint ( $g$ ) terms are unknown, but data is available.³ Unlike many existing works that assume knowledge of the DO model and focus on predicting its parameters,⁴^{, 5} DO model learning, also known as Optimization with Constraint Learning⁶ or Empirical Model Learning,⁷ seeks to embed an ML model into a DO model, considering (some of) its decision variables as features of the ML model.⁸^{, 9}^{, 10}

Optimization problem instances

At its core, DOFramework generates synthetic optimization problem instances at random. An optimization problem instance is a tuple $(f, Ω, D, x^*)$ that consists of a continuous piece-wise linear function $f$ a bounded convex polytope $Ω = \{x | g(x)≤0\}$ (meaning $g$ is affine), a dataset $D = (x,y)$ associated with $f$ (that is, $y = f(x)$ + error), and the optimum $x^*$ of $f$ in $Ω$ . The function is the objective target of the optimization problem, while the polytope provides the constraints. The ground-truth optimal value of the optimization problem instance is. The key algorithmic challenge of DOFramework lies in producing $(f, Ω, D, x^* )$ so that is known combinatorially rather than analytically.

Prediction profile

A DO model learner solves an optimization problem instance when it produces a learned DO model that fits the ground truth best. The best learned DO model will produce a learned optimum $x̂$ whose true value $f(x̂)$ is (or, more likely, close to) the true optimal value $f(x^*)$ . A DO model learner acts as a black box that takes in data and knowledge (namely, those known terms of the DO model) and spits out a learned optimum $x̂$ .

By comparing $f(x̂)$ to $f(x^*)$ , over many generated optimization problem instances, DOFramework derives a prediction profile of the DO model learner. To put this more mathematically, DOFramework provides an empirical distribution that allows the user to estimate the probability

Pr[f(x̂)- f_{min} < ϵ(f_{max} - f_{min}) | Ω] (†)

for any $0<ϵ<1$ . It therefore quantifies how good we expect the learned optima to be. When considering minimization, the closer $(†)$ is to 1 the better, and the closer $ϵ$ is to 0, the harder it would be to achieve.

Here is an example of two prediction profiles derived by DOFramework. The prediction profile on the right comes from a baseline DO model learner that combines linear regression together with the PuLP solver.¹¹ The prediction profile on the left comes from OptiCL¹² with Trust Region. The comparison is done across “low” $(d = 2,3)$ and “high” $(d = 5,7)$ dimension problems.

Screenshot 2023-01-25 at 11.56.54 AM.png

We can see how OptiCL performs better over the baseline DO model learner: it is more consistent and therefore more reliable, though more conservative. We can quantify this observation by estimating the probability $(†)$ for various values, using these empirical distributions.

Baseline				OptiCL
d low		d high		d low		d high
	Pr		Pr		Pr		Pr
0.1	0.19	0.1	0.06	0.1	0.21	0.1	0.02
0.2	0.37	0.2	0.17	0.2	0.50	0.2	0.14
0.3	0.49	0.3	0.30	0.3	0.70	0.3	0.35
0.4	0.58	0.4	0.43	0.4	0.81	0.4	0.58

Could-distributed design

DOFramework was designed for scale, performance, robustness, and reproducibility, while keeping it agnostic to the user’s DO model learner, integrated as a black box. Combining Ray¹³ and Rayvens¹⁴ together with Cloud Object Storage (COS) allowed us to achieve exactly that: a fully-distributed, event-driven implementation that deploys easily on a Kubernetes cluster (IBM Cloud or AWS). DOFramework can be deployed locally as well, replacing COS with local storage. All we need is a storage “bucket,” whether an S3 bucket or a local folder.

The cloud-distributed computing framework Ray provides simple primitives for distributed applications. Rayvens provides abstractions for event management built on top of Ray. Combined with Ray, Rayvens provides a powerful tool for integrating event-driven stream processing together with cloud distribution.

Here is how we use Ray and Rayvens in practice. Once uploaded to the meta bucket, DOFramework input files (with essential metadata to generate problems) trigger Rayvens events: Rayvens feeds these inputs to a processor that uses them to generate objectives $f$ (with constraints $Ω$ ). These generation tasks are distributed asynchronously via Ray. Rayvens then streams each generated objective $f$ (with constraints $Ω$ ) as a file into the objectives bucket. Once objective files land in the objectives bucket, they trigger new events, again, handled by Rayvens, which feeds them into a second processor. This second processor uses each newly created objective $f$ (with constraints $Ω$ ) to generate datasets $D$ . Rayvens then streams each dataset $D$ as a file into the data bucket. Then the pattern repeats itself: when dataset files land in the data bucket, they trigger events consumed by a third processor. This time it’s the user’s DO model learner, which takes in the dataset $D$ (together with its associated constraints $Ω$ ) and produces a learned optimum $x̂$ . Each learned optimum is uploaded over the Rayvens stream to the final estimates bucket, where it’s ready for further analysis.

After the DO model learner has solved enough optimization problem instances, we are ready to derive its prediction profile and estimate how well it did $(†)$ .

Screenshot 2023-01-25 at 12.10.37 PM.png

To learn more about our work, join our Optimization with Constraint Learning AAAI 2023 Lab on February 8th, 2023, or check out our GitHub OCL Lab repo. There you will find Jupyter Notebooks demonstrating OptiCL and DOFramework, together with detailed installation instructions for OCL Lab materials.

Subscribe to our Future Forward newsletter and stay up to date on the latest research news

Subscribe to our newsletter

References

Davidovich, O., Bercea, G.-T., & Wasserkrug, S. (2022). The Good, the Bad, and the Outliers: A Testing Framework for Decision Optimization Model Learning. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’22). Association for Computing Machinery, New York, NY, USA, 2811–2821. https://doi.org/10.1145/3534678.3539094 ↩
Farama Foundation (2022-). Gymnasium: A standard API for reinforcement learning. Gymnasium. https://github.com/Farama-Foundation/Gymnasium ↩
Fischetti, M., & Fraccaro, M. (2019). Machine learning meets mathematical optimization to predict the optimal production of offshore wind parks. Computers and Operations Research, 106(C), 289–297. https://doi.org/10.1016/j.cor.2018.04.006 ↩
Bertsimas, D., & Kallus, N. (2020). From predictive to prescriptive analytics. Management Science, 66(3), 1025–1044. https://doi.org/10.1287/mnsc.2018.3253 ↩
Elmachtoub, A. N., & Grigas, P., (2022). Smart “predict, then optimize”. Management Science, 68(1), 9–26. https://doi.org/10.1287/mnsc.2020.3922 ↩
Fajemisin, A., Maragno, D., & den Hertog, D. (2021). Optimization with Constraint Learning: A Framework and Survey. Arxiv eprint 2110.02121. https://arxiv.org/abs/2110.02121 ↩
Lombardi, M., Milano, M., Bartolini, A. (2017). Empirical decision model learning. Artificial Intelligence, 244, 343–367, https://doi.org/10.1016/j.artint.2016.01.005 ↩
Biggs, M., Hariss, R., & Perakis, G. (2017). Optimizing Objective Functions Determined from Random Forests. http://dx.doi.org/10.2139/ssrn.2986630 ↩
Verwer, S., Zhang, Y., & Ye, Q. C. (2017). Auction Optimization Using Regression Trees and Linear Models as Integer Programs. Artificial Intelligence, 244, 368–395. https://doi.org/10.1016/j.artint.2015.05.004 ↩
Villarrubia, G., De Paz, J. F., Chamoso, P., & De la Prieta., F. (2018). Artificial neural networks used in optimization problems. Neurocomput, 272(C), 10–16. https://doi.org/10.1016/j.neucom.2017.04.075 ↩
Mitchell, S., Kean, A., Mason, A. O’Sullivan, M., Phillips, A., & Peschiera, F. (2009-). Optimization with PuLP. PuLP. https://coin-or.github.io/pulp/index.html ↩
Maragno, D., & Wiberg, H. (2021-). OptiCL: An end-to-end framework for mixed-integer optimization with data-driven learned constraints. OptiCL. https://github.com/hwiberg/OptiCL ↩
Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., Jordan, M. I. , & Stoica, I. (2018). Ray: A Distributed Framework for Emerging AI Applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 561–577. https://dl.acm.org/doi/10.5555/3291168.3291210 ↩
Bercea, G.-T., & Tardieu, O. (2021). Rayvens: Event sources and sinks on Ray. Rayvens. https://github.com/project-codeflare/rayvens ↩

How the IBM Research AI Hardware Center is building tomorrow’s processors
Deep Dive
Peter Hess
12 Aug 2025
All decisions have trade-offs. IBM’s Wei Sun is an expert at weighing them
Q & A
Kim Martineau
06 Aug 2025
IBM Storage Scale delivers real-world performance: an in-depth analysis
Technical note
Brian Belgodere, Chris Miller, John Lewars, Matthew Klos, Yukio Hayashi Leon, Mara Miranda Bautista, and Olaf Weiser
04 Aug 2025
- AI
- Hybrid Cloud Infrastructure
Debugging LLMs to improve their credibility
Research
Kim Martineau
30 Jul 2025

Learned DO models

Optimization problem instances

Prediction profile

Could-distributed design

References

Related posts

How the IBM Research AI Hardware Center is building tomorrow’s processors

All decisions have trade-offs. IBM’s Wei Sun is an expert at weighing them

IBM Storage Scale delivers real-world performance: an in-depth analysis

Debugging LLMs to improve their credibility