Optimizing Cloud Workloads: Autoscaling with Reinforcement Learning
Abstract
By 2027, over 50% of enterprises are expected to adopt industry cloud platforms, with a potential EBITDA value of $3 trillion by 2030. In this landscape, software providers rely on Infrastructure-as-a-Service (IaaS) providers for virtualized resources tailored to their usage. Optimizing resource utilization is crucial to reducing operating costs while maintaining quality standards, which requires dynamic scaling mechanisms that adjust resources as the workload varies. The Kubernetes Horizontal Pod Autoscaler (HPA), which scales applications based on CPU utilization, has limitations in this role. AI-based algorithms, particularly Reinforcement Learning (RL), offer a promising alternative: they can overcome fixed-parameter constraints, handle sudden load spikes, and support custom parameters. We present an RL-based framework for autoscaling applications and report results from an experimental evaluation.