Detecting and identifying system changes in the cloud via discovery by example
Abstract
Discovering and identifying system changes caused by events such as software installations and updates, configuration changes, and security patches is an important capability for change management, security, compliance, and problem diagnosis in emerging cloud platforms. Currently, most discovery tools rely on manually written rules, which require specific knowledge of the software and systems involved. Such rule-based approaches are often fragile and require constant maintenance in this era of continuous integration. In this paper, we propose a novel 'discovery by example' approach that autonomously searches for and identifies system changes. Our approach learns the characteristic features of system changes automatically, without requiring any explicit rule definitions or specific knowledge of the underlying software or systems. Given a system change, our method searches a repository of previously stored system changes and returns those that are similar to it. We further explore the use of various forms of 'fingerprints' to represent system changes compactly and faithfully. We propose and evaluate two types of fingerprints: the 'basename fingerprint' and the '1-D histogram fingerprint'. We show that the two fingerprints exhibit different trade-offs between efficiency and accuracy, and that each can be employed effectively in different use cases. We evaluate the performance of our approach with both fingerprints and present an application of it to real-time streaming monitoring of systems.
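To make the idea of searching a repository by example concrete, the following is a minimal sketch and not the method evaluated in the paper. The abstract does not define the fingerprints, so the sketch assumes that a system change is given as a list of changed file paths and that a basename fingerprint is simply the set of file basenames in that list. The Jaccard index is a stand-in similarity measure chosen for illustration, and the names basename_fingerprint, jaccard, find_similar, and threshold are hypothetical.

```python
from pathlib import PurePosixPath


def basename_fingerprint(changed_paths):
    """Assumed definition: the set of basenames of the changed files."""
    return frozenset(PurePosixPath(p).name for p in changed_paths)


def jaccard(a, b):
    """Jaccard index of two sets (illustrative similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_similar(query_paths, repository, threshold=0.5):
    """Return (label, score) pairs for stored changes similar to the query.

    `repository` maps a label to the list of file paths changed by a
    previously stored system change. Results are sorted by descending score.
    """
    query = basename_fingerprint(query_paths)
    scored = [
        (label, jaccard(query, basename_fingerprint(paths)))
        for label, paths in repository.items()
    ]
    matches = [(label, score) for label, score in scored if score >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)
```

Under these assumptions, a query whose files sit under a different installation prefix can still match a stored change, because only the basenames are compared. No rule describing the software is written by hand; the stored example itself acts as the pattern.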