Automated system change discovery and management in the cloud
Emerging cloud service platforms host hundreds of thousands of virtual machine instances, each of which evolves differently from the time it is provisioned. As a result, cloud service operators face great challenges in continuously managing, monitoring, and maintaining a large number of diversely evolving systems, and in discovering potential resilience and vulnerability issues in a timely manner. In this paper, we introduce an automated cloud analytics solution that uses machine learning for system change discovery and management. The learning-based approaches we employ are widely used in multimedia and web content analysis, but their application to the cloud management context is a novel aspect of our work. We first propose multiple feature extraction methods that generate condensed 'fingerprints' from the comprehensive system metadata recorded during system changes. We then build an adaptive knowledge base from all known fingerprint samples. We evaluate different machine learning algorithms as part of the proposed discovery and identification framework. Experimental results gathered from several real-life systems demonstrate that our approach is fast and accurate for system change discovery and management in emerging cloud services.
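
To make the fingerprint and knowledge-base idea concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes that a system change is recorded as a list of changed file paths, that a fingerprint is a fixed-length hashed token count vector, and that identification is a nearest-neighbor lookup by cosine similarity. The names `fingerprint`, `KnowledgeBase`, `DIM`, and `min_similarity` are hypothetical and chosen only for illustration.

```python
import hashlib
import math
from collections import Counter

DIM = 64  # assumed fingerprint length; not a value taken from the paper


def fingerprint(changed_paths, dim=DIM):
    """Hash path components of changed files into a unit-length vector."""
    vec = [0.0] * dim
    for path in changed_paths:
        for token in path.strip("/").split("/"):
            # md5 serves only as a stable hash here, not for security.
            digest = hashlib.md5(token.encode("utf-8")).hexdigest()
            vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit-length."""
    return sum(x * y for x, y in zip(a, b))


class KnowledgeBase:
    """Stores labeled fingerprints and identifies new changes by similarity."""

    def __init__(self):
        self.samples = []  # list of (fingerprint, label) pairs

    def add(self, changed_paths, label):
        # The knowledge base adapts simply by accumulating new known samples.
        self.samples.append((fingerprint(changed_paths), label))

    def identify(self, changed_paths, k=1, min_similarity=0.5):
        """Return the majority label of the k nearest samples, or None."""
        query = fingerprint(changed_paths)
        scored = sorted(
            ((cosine(query, fp), label) for fp, label in self.samples),
            reverse=True,
        )
        top = [label for score, label in scored[:k] if score >= min_similarity]
        if not top:
            return None  # treated as an unknown change
        return Counter(top).most_common(1)[0][0]
```

In this sketch, a change whose best match falls below `min_similarity` is reported as unknown (`None`), which is one plausible point at which an operator could label it and add it to the knowledge base.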