LogInsights: Understanding and Extracting Information from Logs for Fast Fault Classification by Weak Supervision

Suranjana Samanta; Prateeti Mohapatra; Fabian Lim; Meenakshi Madugula; Xiaotong Liu; Sarasi Lalithsena

doi:10.1109/SSE60056.2023.00014

Publication

SSE 2023

Conference paper

Abstract

In many real-world applications, labeled training data for text classification is hard to come by. These tasks are often domain specific, and the vocabulary of the textual input differs from that of general language. In this paper, we address one such task: automating a software monitoring system in which logs are analyzed in real time. We describe a weakly supervised method that processes incoming streams of logs to identify the fault types they contain. We propose hand-crafted feature extraction steps designed specifically for classifiers that take logs as input. To keep processing time-efficient and generalizable across log sources, we rely on a weakly supervised fault classifier in which domain knowledge is incorporated through a word embedding model built on a domain-specific corpus. Experiments on logs obtained from various applications show the efficacy of the proposed method.
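
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of embedding-based weak supervision for log lines, not the authors' method. Everything in it is an assumption made for illustration: the toy three-dimensional vectors in `EMBEDDINGS`, the seed words in `SEED_WORDS`, the fault labels, and the similarity threshold are all hypothetical. A real system would load embeddings trained on a domain-specific corpus.

```python
import math
import re

# Hypothetical toy embeddings (assumption). A real system would load
# vectors from a word embedding model trained on a domain corpus.
EMBEDDINGS = {
    "timeout": (0.9, 0.1, 0.0),
    "connection": (0.8, 0.2, 0.1),
    "refused": (0.85, 0.1, 0.05),
    "memory": (0.1, 0.9, 0.0),
    "heap": (0.05, 0.95, 0.1),
    "disk": (0.0, 0.1, 0.9),
    "quota": (0.1, 0.05, 0.85),
}

# Hypothetical seed words per fault type (assumption). These stand in
# for labeled training data: no log line is labeled by hand.
SEED_WORDS = {
    "network": ["timeout", "connection"],
    "memory": ["memory", "heap"],
    "storage": ["disk", "quota"],
}


def tokenize(line):
    """Lowercase a log line and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", line.lower())


def mean_vector(words):
    """Average the embeddings of in-vocabulary words, or None if none match."""
    vecs = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vecs:
        return None
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(line, threshold=0.8):
    """Assign the fault type whose seed centroid is closest to the log line."""
    vec = mean_vector(tokenize(line))
    if vec is None:
        return "unknown"
    scores = {
        label: cosine(vec, mean_vector(seeds))
        for label, seeds in SEED_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


if __name__ == "__main__":
    print(classify("2023-07-02 ERROR Connection refused after timeout"))
    print(classify("java.lang.OutOfMemoryError: heap space"))
    print(classify("user logged in"))
```

In this sketch, a log line is compared against the centroid of each fault type's seed words, and lines with no in-vocabulary tokens or no sufficiently similar centroid fall back to `"unknown"`.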

Date

02 Jul 2023

Authors

IBM-affiliated at time of publication
