Publication
SYSTOR 2024
Conference paper

ARISE: AI Right Sizing Engine for AI workload configurations


Abstract

Data scientists and platform engineers who maintain AI stacks must continuously run AI workloads. When executing any part of the AI pipeline, whether data preprocessing, training, fine-tuning, or inference, a frequent question is how to configure the environment optimally to meet Service Level Objectives (SLOs), such as reaching a desired throughput, meeting runtime deadlines, and avoiding memory and CPU exhaustion. We present ARISE, a tool that supports data-driven decisions about AI workload configuration. ARISE trains machine-learning regression models for performance prediction on historical workload and performance benchmark metadata, then uses the best-performing models to predict the performance of future workloads from their input metadata. An initial evaluation of ARISE on real-world workloads shows high prediction accuracy.
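
The workflow the abstract describes (fit several regression models on historical runs, keep the best-performing one, and use it to answer a configuration question) can be illustrated with a minimal sketch. This is not ARISE's implementation: the candidate models, the leave-one-out selection criterion, the single `num_workers` feature, all function names, and the data values are assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical history: (num_workers, observed throughput in samples/s).
# These numbers are invented for the example, not taken from the paper.
HISTORY = [(8, 410.0), (16, 790.0), (32, 1620.0),
           (64, 3180.0), (128, 6420.0), (256, 12790.0)]


def fit_mean(rows):
    """Baseline candidate: always predict the mean observed throughput."""
    avg = mean(y for _, y in rows)
    return lambda x: avg


def fit_linear(rows):
    """Candidate: ordinary least squares for y = a * x + b on one feature."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return lambda x: a * x + b


CANDIDATES = {"mean": fit_mean, "linear": fit_linear}


def loo_error(fit, rows):
    """Leave-one-out mean absolute error of one candidate model."""
    errors = []
    for i, (x, y) in enumerate(rows):
        model = fit(rows[:i] + rows[i + 1:])  # train without row i
        errors.append(abs(model(x) - y))      # test on row i
    return mean(errors)


def select_best(rows):
    """Return the name and fitted model of the lowest-error candidate."""
    scores = {name: loo_error(fit, rows) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, CANDIDATES[best](rows)  # refit the winner on all rows


def smallest_config_meeting_slo(model, options, slo):
    """Smallest option whose predicted throughput reaches the SLO, else None."""
    for option in sorted(options):
        if model(option) >= slo:
            return option
    return None


best_name, best_model = select_best(HISTORY)
choice = smallest_config_meeting_slo(
    best_model, [8, 16, 32, 64, 128, 256], slo=3000.0)
print(best_name, choice)
```

A real predictor would use many metadata features and richer model families; the sketch only shows the select-then-predict structure, with the SLO check applied to the selected model's predictions.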
