Performance management via adaptive thresholds with separate control of false positive and false negative errors
Abstract
Component-level performance thresholds are widely used as a basic means of performance management. As the complexity of managed systems increases, maintaining thresholds manually becomes difficult. The difficulty may result from a) a large number of system components and their operational metrics, b) dynamically changing workloads, and c) complex dependencies between system components. To alleviate this problem, we advocate that component-level thresholds be computed, managed, and optimized automatically and autonomously. To this end, we have designed and implemented a performance threshold management sub-system that automatically and dynamically computes two separate component-level thresholds: one for controlling Type I errors (false positives) and another for controlling Type II errors (false negatives). We present the theoretical foundation for this autonomic threshold management system, describe a specific algorithm and its implementation, and evaluate it using real-life scenarios and production data sets. As our study shows, with proper parameter tuning, our on-line dynamic solution is capable of computing nearly optimal performance thresholds. © 2009 IEEE.
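To make the idea of two separately controlled thresholds concrete, the following is a minimal illustrative sketch, not the algorithm presented in the paper. It assumes a batch of historical metric samples already labeled as normal or violating, an alarm rule of the form "value > threshold", and target error rates `alpha` (Type I) and `beta` (Type II). All function and variable names are hypothetical.

```python
import math


def type1_threshold(normal_values, alpha):
    """Smallest threshold t such that at most a fraction alpha of the
    normal samples satisfy value > t (false positives under the
    assumed alarm rule)."""
    values = sorted(normal_values)
    if not values:
        raise ValueError("need at least one normal sample")
    allowed = int(alpha * len(values))  # tolerated false positives
    if allowed >= len(values):
        return -math.inf  # every normal sample may raise an alarm
    return float(values[len(values) - allowed - 1])


def type2_threshold(violating_values, beta):
    """Largest threshold t such that at most a fraction beta of the
    violating samples satisfy value <= t (false negatives under the
    assumed alarm rule)."""
    values = sorted(violating_values)
    if not values:
        raise ValueError("need at least one violating sample")
    allowed = int(beta * len(values))  # tolerated missed violations
    if allowed >= len(values):
        return math.inf  # every violation may be missed
    # Step just below the first violation that must be caught.
    return math.nextafter(float(values[allowed]), -math.inf)


def false_positive_rate(normal_values, threshold):
    """Fraction of normal samples that would raise an alarm."""
    return sum(v > threshold for v in normal_values) / len(normal_values)


def false_negative_rate(violating_values, threshold):
    """Fraction of violating samples that would raise no alarm."""
    return sum(v <= threshold for v in violating_values) / len(violating_values)


if __name__ == "__main__":
    # Made-up response-time samples, for illustration only.
    normal = [10, 12, 11, 13, 30, 14, 12, 11, 10, 15]
    violating = [40, 35, 50, 14, 45, 38, 42, 55, 36, 60]
    t1 = type1_threshold(normal, alpha=0.1)
    t2 = type2_threshold(violating, beta=0.1)
    print("Type I threshold:", t1, "FP rate:", false_positive_rate(normal, t1))
    print("Type II threshold:", t2, "FN rate:", false_negative_rate(violating, t2))
```

The sketch is an offline, batch computation over labeled data; the system described in the abstract is on-line and dynamic, so it would have to update such thresholds as workloads change. `math.nextafter` requires Python 3.9 or later.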