IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans

An integrated data-driven framework for computing system management

View publication


With advancement in science and technology, computing systems are becoming increasingly more complex with a growing number of heterogeneous software and hardware components. They are thus becoming more difficult to monitor, manage, and maintain. Traditional approaches to system management have been largely based on domain experts through a knowledge acquisition solution that translates domain knowledge into operating rules and policies. This process has been well known as cumbersome, labor intensive, and error prone. In addition, traditional approaches for system management are difficult to keep up with the rapidly changing environments. There is a pressing need for automatic and efficient approaches to monitor and manage complex computing systems. In this paper, we propose an integrated data-driven framework for computing system management by acquiring the needed knowledge automatically from a large amount of historical log data. Specifically, we apply text mining techniques to automatically categorize the log messages into a set of canonical categories, incorporate temporal information to improve categorization performance, develop temporal mining techniques to discover the relationships between different events, and take a novel approach called event summarization to provide a concise interpretation of the temporal patterns. © 2009 IEEE.