Publication
CLOUD 2024
Conference paper

Process-based Efficient Power Level Exporter

Abstract

In this paper, we present the Kepler framework, designed to address the critical need for precise power and energy measurement in on-premises, cloud-native, containerized environments, with a specific focus on processes, containers, and Kubernetes pods. The framework aims to support other tools in making informed decisions about provisioning, scheduling, and energy optimization in cloud environments. Our approach uses Kepler to create power models from Hardware Counters (HC) and real-time system power metrics from hardware sensors such as the x86 Running Average Power Limit (RAPL). Unlike previous methods that create and validate power models using aggregated system metrics, we propose a versatile process-level power model trained with per-process metrics. These metrics are collected through a series of experiments in a controlled environment that measure the incremental power consumption of processes under different scenarios. The collected data is then used to create a power model for use in a shared cloud environment and to validate the resulting power models using different sets of input metrics. Our results show a significant improvement in model accuracy compared to prior work when per-process metrics and real-time system power metrics are incorporated into the power estimation process. For instance, the simplest power model, which is based on the CPU utilization ratio, resulted in a Sum of Squared Errors (SSE) of 75. In contrast, a power model created using aggregated system metrics, as in related work, had an SSE of 175 without real-time power metrics and 5.6 with our proposed model refinement, which normalizes the model results with the real-time system power metrics.
On the other hand, training the power model with per-process metrics from controlled experiments yielded an SSE as low as 1.68 when using real-time system power metrics, a 70% improvement in model accuracy over aggregated system metrics (SSE of 5.6), and an SSE of 8.7 without power metrics, a 95% improvement (over an SSE of 175). Furthermore, the results show that Kepler achieves notably lower overhead than alternative methods by using extended Berkeley Packet Filter (eBPF) for HC collection.
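The refinement and the accuracy figures quoted above can be illustrated with a minimal sketch. The abstract does not give the exact normalization formula, so the proportional rescaling below is an assumption, and all function names are hypothetical rather than part of Kepler's API. The sketch rescales per-process power estimates so that they sum to the measured system power (for example, a RAPL reading), computes the SSE, and reproduces the relative improvements from the reported SSE values.

```python
from typing import Sequence


def normalize_to_measured(estimates: Sequence[float], measured_total: float) -> list[float]:
    """Rescale per-process estimates so they sum to the measured system power.

    Assumption: proportional scaling. The paper may use a different scheme.
    """
    total = sum(estimates)
    if total <= 0:
        # No usable model signal: fall back to an even split (hypothetical choice).
        n = len(estimates)
        return [measured_total / n] * n if n else []
    scale = measured_total / total
    return [e * scale for e in estimates]


def sse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of squared errors between predicted and reference power values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def relative_improvement(baseline_sse: float, new_sse: float) -> float:
    """Fractional reduction in SSE relative to a baseline."""
    return (baseline_sse - new_sse) / baseline_sse


# Improvements computed from the SSE values reported in the abstract.
with_power = relative_improvement(5.6, 1.68)      # about 0.70
without_power = relative_improvement(175.0, 8.7)  # about 0.95
```

Under this reading, the 70% and 95% figures are reductions in SSE relative to the aggregated-metrics model with and without real-time power metrics, respectively.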