Publication
SCC 2017
Conference paper
Lightweight and Adaptive Service API Performance Monitoring in Highly Dynamic Cloud Environment
Abstract
Cloud platforms and services usually provide an API layer as a decoupled, language-agnostic interface for both front-end client integration and back-end data and/or function access. The availability and performance of the APIs have a significant impact on the quality of end user or client experiences because the APIs serve as the interaction endpoints. However, the extreme dynamics, complexity and scale of current cloud platforms challenge the applicability of existing performance monitoring and anomaly detection approaches from timeliness, accuracy, and scalability perspectives. This paper presents a novel approach to API performance monitoring, which recognizes performance problems by response time deviation from a baseline response time / throughput model that is created and continuously updated through online learning. In the post-detection phase, an MIC (Maximal Information Criteria) based correlation algorithm is used to group alerts into higher-level and more informative hyper-alerts for end user notification. We prototyped our solution for a large-scale commercial cloud platform, evaluated it using three months of API performance metrics data, and compared it with a couple of existing representative algorithms and tools. The results show that our approach is able to detect API performance anomalies with a high F1-score. Compared to an existing Granger-based approach, our approach achieves an increase of nearly one time in F1-score. Moreover, the alert reduction ratio of our approach outperforms several state-of-the-art approaches.
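The abstract does not specify the form of the baseline model or the learning rule, so the following is only a minimal sketch of the general idea of detecting response time deviation from a continuously updated response time / throughput baseline. It assumes a simple stand-in technique: throughput is split into fixed-width buckets, and each bucket keeps an exponentially weighted mean and variance of response time. The class name `OnlineBaseline` and the parameters `bucket_width`, `alpha`, `k`, and `warmup` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


class OnlineBaseline:
    """Illustrative online baseline: per-throughput-bucket EWMA of response time.

    This is an assumed stand-in, not the model described in the paper.
    """

    def __init__(self, bucket_width=10.0, alpha=0.1, k=3.0, warmup=5):
        self.bucket_width = bucket_width  # throughput range covered by one bucket
        self.alpha = alpha                # EWMA smoothing factor
        self.k = k                        # deviation threshold in standard deviations
        self.warmup = warmup              # samples required before a bucket can alert
        self.stats = defaultdict(lambda: {"n": 0, "mean": 0.0, "var": 0.0})

    def observe(self, throughput, response_time):
        """Return True if the sample deviates from the baseline, then learn from it."""
        bucket = int(throughput // self.bucket_width)
        s = self.stats[bucket]

        anomalous = False
        if s["n"] >= self.warmup:
            std = max(math.sqrt(s["var"]), 1e-9)
            anomalous = abs(response_time - s["mean"]) > self.k * std

        # Only normal samples update the baseline, so an outage does not
        # get absorbed into what is considered "normal".
        if not anomalous:
            if s["n"] == 0:
                s["mean"] = response_time
            else:
                delta = response_time - s["mean"]
                s["mean"] += self.alpha * delta
                s["var"] = (1 - self.alpha) * (s["var"] + self.alpha * delta * delta)
            s["n"] += 1
        return anomalous
```

Each call to `observe` both scores the sample and, if it looks normal, folds it into the baseline, which is one simple way to keep the model current without a separate training phase. Skipping the update for flagged samples is a design choice of this sketch; a production system would also need a way to adapt when the shift is a legitimate new normal.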