Probabilistic text analytics framework for information technology service desk tickets
Abstract
Ticket annotation and search have become essential research subjects for the successful delivery of IT operational analytics. Millions of tickets are created every year to address business users' IT-related problems. In IT service desk management, it is critical, first, to capture the pain points for a group of tickets in order to determine the root cause and, second, to obtain their respective distributions in order to prioritize which pain points to address. An advanced ticket analytics system uses a combination of topic modeling, clustering and Information Retrieval (IR) technologies to address these issues; an architecture that integrates these features would allow wider distribution of the technology and could yield a significant financial benefit for the system owner. Topic modeling has been used to extract topics from given documents; in general, each topic is represented by a unigram language model. However, it has so far been unclear how to present the results in an easily readable and understandable way. Because existing techniques are inefficient at rendering top concepts, in this paper we propose a probabilistic framework for this practical challenge that combines language modeling (especially topic models), Part-Of-Speech (POS) tags, query expansion, retrieval modeling and related components. Rigorous empirical experiments on real datasets demonstrate the consistent performance and utility of the proposed method.