Reliable and Interpretable Drift Detection in Streams of Short Texts
Abstract
Data drift, a change in a model's input data, is one of the key factors that lead to degradation of machine learning model performance over time. Monitoring drift helps detect these issues and prevent their harmful consequences. Meaningful drift interpretation is a fundamental step towards effective re-training of the model. In this study, we propose an end-to-end framework for reliable, model-agnostic change-point detection and interpretation in large task-oriented dialog systems, proven effective in multiple customer deployments. We evaluate our approach and demonstrate its benefits with a novel, carefully curated dataset that simulates customer requests to a dialog system. We make the data publicly available to the research community.