Building Golden Signal Based Signatures for Log Anomaly Detection
As an increasing number of organizations migrate to the cloud, the main challenge facing an operations team is how to make effective use of the overwhelming amount of information derivable from multiple data sources, such as logs, metrics, and traces, to help maintain the robustness and availability of cloud services. Site Reliability Engineers (SREs) depend on periodic log data to understand the state of an application and to diagnose the potential root cause of a problem. Despite best practices, service outages happen and result in the loss of billions of dollars in revenue. Often, indicators of these outages are buried in the flood of alerts that an SRE receives. Therefore, it is important to reduce noisy alerts so that an SRE can focus on what is critical. Log anomaly detection identifies anomalous system behaviour by finding patterns (anomalies) in data that do not conform to expected behaviour. Different anomaly detection techniques have been incorporated into various AIOps platforms, but they all suffer from a large number of false positives. Moreover, some anomalies are transient and resolve on their own. In this paper, we propose an unsupervised, model-agnostic persistent anomaly detector built on golden-signal-based signatures, applied as a post-processing filtering step on detected anomalies, so that the anomaly detector already deployed in a system does not have to be modified.