Evaluation of Causal Inference Techniques for AIOps

Vijay Arya; Karthikeyan Shanmugam; Pooja Aggarwal; Qing Wang; Prateeti Mohapatra; Seema Nagar

doi:10.1145/3430984.3431027

CODS-COMAD 2021

Conference paper

02 Jan 2020

Abstract

Inferring causality of events from log data is critical to IT operations teams, who continuously strive to identify probable root causes of events so that they can resolve incident tickets quickly and keep downtimes and service interruptions to a minimum. Although prior work has applied specific causal inference techniques to proprietary log data, it does not benchmark the performance of different techniques on a common system or dataset. In this work, we evaluate the performance of multiple state-of-the-art causal inference techniques using log data obtained from a publicly available benchmark microservice system. We model log data both as a time series of error counts and as a temporal event sequence, and evaluate three families of Granger causal techniques: regression based, independence testing based, and event models. Our preliminary results indicate that event models yield causal graphs with higher precision and recall than regression and independence testing based Granger methods.
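The two data representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the service names, the one-second bin width, and the helper names `error_count_series` and `error_event_sequence` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical parsed log records: (timestamp in seconds, service, log level).
records = [
    (0.5, "frontend", "ERROR"),
    (1.2, "cart", "INFO"),
    (1.7, "cart", "ERROR"),
    (2.1, "frontend", "ERROR"),
    (2.4, "frontend", "ERROR"),
    (4.9, "cart", "ERROR"),
]

def error_count_series(records, bin_seconds, num_bins):
    """Return {service: [error count per fixed-width time bin]}."""
    counts = Counter()
    services = set()
    for ts, service, level in records:
        if level != "ERROR":
            continue
        services.add(service)
        counts[(service, int(ts // bin_seconds))] += 1
    # Missing (service, bin) keys read as 0 from a Counter.
    return {s: [counts[(s, b)] for b in range(num_bins)] for s in sorted(services)}

def error_event_sequence(records):
    """Return the time-ordered list of (timestamp, service) error events."""
    return sorted((ts, service) for ts, service, level in records if level == "ERROR")

series = error_count_series(records, bin_seconds=1.0, num_bins=5)
events = error_event_sequence(records)
print(series)
print(events)
```

Under these assumptions, the per-service count series would be the input to regression based or independence testing based Granger methods, while the ordered event list would be the input to an event model.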