GAUL: Gestalt analysis of unstructured logs for diagnosing recurring problems in large enterprise storage systems
Abstract
We present GAUL, a system to automate the whole log comparison between a new problem and the ones diagnosed in the past to identify recurring problems. GAUL uses a fuzzy match algorithm based on the contextual overlap between log lines and efficiently implements this using scalable index/search. The accuracy and efficiency of the comparison is further improved by leveraging problem set information and noise tolerance techniques. We evaluate GAUL using 4339 customer problems that occurred in all field deployments of an enterprise storage system over the course of a year. Our results show that with human-filtered logs, GAUL can identify the correct problem set 66% of the time among the top10 matches, which is 15% more accurate than the VSM system that uses cosine similarity and 19% more accurate than the ERRCMP system that uses error codes for log comparison. With unfiltered logs, the top10 match accuracy of GAUL is 40%, which is 22% more accurate than VSM and 26% more accurate than ERRCMP. © 2010 IEEE.