An Approach for Anomaly Diagnosis Based on Hybrid Graph Model with Logs for Distributed Services
Detecting runtime anomalies is very important to monitoring and maintenance of distributed services. People often use execution logs for troubleshooting and problem diagnosis manually, which is time consuming and error-prone. In this paper, we propose an approach for automatic anomaly detection based on logs. We first mine a hybrid graph model that captures normal execution flows inter and intra services, and then raise anomaly alerts on observing deviations from the hybrid model. We evaluate the effectiveness of our approach by leveraging logs from an IBM public cloud production platform and two simulated systems in the lab environment. Evaluation results show that our hybrid graph model mining performs over 80% precision and 70% recall and anomaly detection performs nearly 90% precision and 80% recall on average.