Publication
Big Data 2014
Conference paper
TRISTAN: Real-time analytics on massive time series using sparse dictionary compression
Abstract
Large-scale critical infrastructures such as transportation, energy, or water distribution networks are increasingly equipped with smart sensor technologies. Low-latency analytics on the resulting time series would open the door to many exciting opportunities to improve our grasp on complex urban systems. However, sensor-generated time series often turn out to be noisy, non-uniformly sampled, and misaligned in practice, making them ill-suited for traditional data processing. In this paper, we introduce TRISTAN (massive TRIckletS Time series ANalysis), a new data management system for efficient storage and real-time processing of fine-grained time series data. TRISTAN relies on a dedicated, compressed sparse representation of the time series using a dictionary. In contrast to previous approaches, TRISTAN is able to execute most analytics queries directly on the compressed data, and supports efficient, approximate query answering based on only the most significant atoms of the dictionary. We present the overall architecture of our system and discuss its performance on several smarter city datasets, showing that TRISTAN can achieve up to 20:1 compression ratios and 250x speedup compared to a state-of-the-art system.
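The idea of answering a query from a sparse dictionary code can be illustrated with a toy sketch. This is not TRISTAN's implementation. The cosine dictionary, the greedy matching pursuit coder, and the names `dct_dictionary`, `matching_pursuit`, and `mean_from_code` are all assumptions chosen for illustration. The sketch encodes a short series as a few (atom, coefficient) pairs and then computes its mean from those pairs alone, optionally restricted to the largest coefficients.

```python
import math

def dct_dictionary(n):
    """Build n orthonormal cosine (DCT-II) atoms of length n.
    Hypothetical stand-in for a real dictionary."""
    atoms = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        atoms.append([scale * math.cos(math.pi * (t + 0.5) * k / n)
                      for t in range(n)])
    return atoms

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, atoms, n_atoms, tol=1e-9):
    """Greedy sparse coding: repeatedly pick the atom most correlated
    with the residual. Returns {atom_index: coefficient}."""
    residual = list(signal)
    code = {}
    for _ in range(n_atoms):
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        coef = scores[best]
        if abs(coef) < tol:  # residual is already negligible
            break
        code[best] = code.get(best, 0.0) + coef
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
    return code

def mean_from_code(code, atom_means, top=None):
    """Answer a mean query from the sparse code only, optionally using
    just the `top` largest-magnitude coefficients."""
    items = sorted(code.items(), key=lambda kv: abs(kv[1]), reverse=True)
    if top is not None:
        items = items[:top]
    return sum(coef * atom_means[k] for k, coef in items)

n = 32
atoms = dct_dictionary(n)
atom_means = [sum(a) / n for a in atoms]  # precomputed once per dictionary

# Synthetic series: a constant level plus one slow cosine component.
signal = [20.0 + 3.0 * math.cos(math.pi * (t + 0.5) * 2 / n) for t in range(n)]

code = matching_pursuit(signal, atoms, n_atoms=4)
approx_mean = mean_from_code(code, atom_means, top=2)
exact_mean = sum(signal) / n
```

Because the mean is linear in the signal, it can be computed as a weighted sum of per-atom means that are precomputed once, so the raw samples are never reconstructed. In this toy case the 32 samples reduce to two coefficients. Real sensor data would generally need more atoms and would give an approximate rather than exact answer.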