Publication
IM 2017
Conference paper
Using residual resource consumption to resample top-k monitoring reports
Abstract
Top-k reports are compound metrics that provide useful information when diagnosing problems in a system, e.g., to identify persistent CPU usage by a process. In large systems, these reports are collected at regular intervals and need to be resampled to a coarser granularity to answer user queries for different sampling periods, or to save space and make it possible to keep historical data for long-term performance analysis. However, resampling top-k reports, i.e., aggregating several reports collected for small time intervals into a single top-k report, can introduce inaccuracies. For example, a process that consistently uses CPU over the aggregation interval but did not make it into the short-term top-k reports will be missing from the aggregated report. In this paper, we present an algorithm that collects top-k reports at regular intervals and can aggregate them with little or no error. This is done by including the residual resource consumption of unreported but potentially significant entities in the top-k reports, and using these residual values during aggregation. We show different approaches to including residual resource consumption in individual top-k reports, analyze the error introduced, and demonstrate the effectiveness of the algorithm in real-world scenarios.
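The resampling problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes one simple way of carrying residuals: each report stores its top-k entries plus the consumption of the next m unreported entities. The names `make_report`, `aggregate`, `m`, and the sample process names are hypothetical.

```python
from collections import Counter


def make_report(usage, k, m=0):
    """Build one interval report: the top-k entries plus, as an assumed
    form of residual, the next m entities that missed the top-k cut."""
    # Sort by descending usage, breaking ties by name for determinism.
    ranked = sorted(usage.items(), key=lambda kv: (-kv[1], kv[0]))
    return {"top": dict(ranked[:k]), "residual": dict(ranked[k:k + m])}


def aggregate(reports, k):
    """Resample several interval reports into one coarser top-k report
    by summing every value the reports carry, residuals included."""
    totals = Counter()
    for report in reports:
        totals.update(report["top"])
        totals.update(report["residual"])
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Hypothetical CPU usage per interval: "steady" never reaches the top 2
# in any single interval but has the largest total over all three.
intervals = [
    {"burst_a": 50, "burst_b": 45, "steady": 40},
    {"burst_c": 50, "burst_d": 45, "steady": 40},
    {"burst_e": 50, "burst_a": 45, "steady": 40},
]

# Without residuals, "steady" is absent from the aggregated report.
naive = aggregate([make_report(u, k=2) for u in intervals], k=2)

# With one residual entry per report, "steady" is recovered.
with_residual = aggregate([make_report(u, k=2, m=1) for u in intervals], k=2)

print(naive)
print(with_residual)
```

In this toy data one residual entry per report is enough to recover the exact aggregate. In general the choice of which residuals to keep trades report size against aggregation error, which is the trade-off the abstract says the paper analyzes.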