Publication
SYSTOR 2024
Conference paper
ARISE: AI Right Sizing Engine for AI workload configurations
Abstract
Data scientists and platform engineers who maintain AI stacks must continuously run AI workloads. When executing any part of the AI pipeline, whether data preprocessing, training, fine-tuning, or inference, a frequent question is how to configure the environment optimally to meet Service Level Objectives (SLOs), such as achieving a desired throughput, meeting runtime deadlines, and avoiding memory and CPU exhaustion. We present ARISE, a tool that supports data-driven decisions about AI workload configuration. ARISE trains machine-learning regression models for performance prediction on metadata from historical workloads and performance benchmarks, and then predicts the performance of future workloads from their input metadata, using the best-performing regression models. An initial evaluation of ARISE on real-world workloads shows high prediction accuracy.
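The general idea of training several regression models on historical runs and keeping the best-performing one can be illustrated with a minimal Python sketch. This is not ARISE's implementation. The data, the single feature (number of workers), the three candidate models, and all function names are assumptions made for illustration, and leave-one-out mean absolute error stands in for whatever selection criterion the tool actually uses.

```python
from statistics import mean

# Hypothetical history: (number of workers, measured throughput in samples/s).
# These values are made up for illustration, not taken from the paper.
HISTORY = [(1, 105.0), (2, 198.0), (4, 410.0), (8, 790.0), (16, 1610.0)]


def fit_mean(data):
    """Baseline: always predict the mean of the observed targets."""
    avg = mean(y for _, y in data)
    return lambda x: avg


def fit_linear(data):
    """Ordinary least squares on a single feature (closed form)."""
    mx = mean(x for x, _ in data)
    my = mean(y for _, y in data)
    sxx = sum((x - mx) ** 2 for x, _ in data)
    if sxx == 0:  # all x identical: fall back to the mean
        return lambda x: my
    slope = sum((x - mx) * (y - my) for x, y in data) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_nearest(data):
    """1-nearest-neighbour: return the target of the closest known x."""
    points = list(data)
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


# Assumed candidate set; a real tool would likely use richer model families.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}


def loo_mae(fit, data):
    """Leave-one-out mean absolute error of one candidate model."""
    errors = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out record i
        errors.append(abs(fit(train)(x) - y))
    return mean(errors)


def select_best(data):
    """Score every candidate, then refit the winner on all data."""
    scores = {name: loo_mae(fit, data) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, CANDIDATES[best](data), scores


best_name, predict, scores = select_best(HISTORY)
print(best_name, round(predict(10), 1))
```

Here `predict(10)` estimates throughput for a configuration with 10 workers, which is absent from the history. Such a prediction could then be compared against an SLO target before the workload is launched.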