About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
SSDBM 2017
Conference paper
Data series similarity using correlation-aware measures
Abstract
The increased availability of unprecedented amounts of sequential data (generated by Internet-of-Things, as well as scientific applications) has led in the past few years to a renewed interest and attention to the field of data series processing and analysis. Data series collections are processed and analyzed using a large variety of techniques, most of which are based on the computation of some distance function. In this study, we revisit this basic operation of data series distance calculation. We observe that the popular distance measures are oblivious to the correlations inherent in neighboring values in a data series. Therefore, we evaluate the plausibility and benefit of incorporating into the distance function measures of correlation, which enable us to capture the associations among neighboring values in the sequence. We propose four such measures, inspired by statistical and probabilistic approaches, which can effectively model these correlations. We analytically and experimentally demonstrate the benefits of the new measures using the 1NN classification task, and discuss the lessons learned. Finally, we propose future research directions for enabling the proposed measures to be used in practice.