Panning requirement nuggets in stream of software maintenance tickets
Abstract
There is an increasing trend to outsource maintenance of large applications and application portfolios of a business to third parties, specialising in application maintenance, who are incented to deliver the best possible maintenance at the lowest cost. To do so, they need to identify repeat problem areas, which cause more maintenance grief, and seek a unified remedy to avoid the costs spent on fixing these individually. These repeat areas, in a sense, represent major, evolving areas of need, or requirements, for the customer. The information about the repeating problem is typically embedded in the unstructured text of multiple tickets, waiting to be found and addressed. Currently, repeat problems are found by manual analysis; effective solutions depend on the collective experience of the team solving them. In this paper, we propose an approach to automatically analyze problem tickets to discover groups of problems being reported in them and provide meaningful, descriptive labels to help interpret these groups. Our approach incorporates a cleansing phase to handle the high level of noise observed in problem tickets and a method to incorporate multiple text clustering techniques and merge their results in a meaningful manner. We provide detailed experiments to quantitatively and qualitatively evaluate our approach.