LogSed: Anomaly Diagnosis through Mining Time-Weighted Control Flow Graph in Logs
Abstract
Detecting execution anomalies is very important to monitoring and maintenance of cloud systems. People often use execution logs for troubleshooting and problem diagnosis, which is time consuming and error-prone. There is great demand for automatic anomaly detection based on logs. In this paper, we mine a time-weighted control flow graph (TCFG) that captures healthy execution flows of each component in cloud, and automatically raise anomaly alerts on observing deviations from TCFG. We outlined three challenges that are solved in this paper, including how to deal with the interleaving of multiple threads in logs, how to identify operational logs that do not contain any transactional information, and how to split the border of each transaction flow in the TCFG. We evaluate the effectiveness of our approach by leveraging logs from an IBM public cloud production platform and two simulated systems in the lab environment. The evaluation results show that our TCFG mining and anomaly diagnosis both perform over 80% precision and recall on average.