IEEE Transactions on Knowledge and Data Engineering

Streaming time series summarization using user-defined amnesic functions


The past decade has seen a wealth of research on time series representations, because the manipulation, storage, and indexing of large volumes of raw time series data are impractical. The vast majority of research has concentrated on representations that are calculated in batch mode and represent each value with approximately equal fidelity. However, the increasing deployment of mobile devices and real-time sensors has brought home the need for representations that can be incrementally updated and can approximate the data with fidelity proportional to its age. The latter property allows us to answer queries about the recent past with greater precision, since in many domains, recent information is more useful than older information. We call such representations amnesic. While there has been previous work on amnesic representations, the class of amnesic functions possible was dictated by the representation itself. In this work, we introduce a novel representation of time series that can represent arbitrary user-specified amnesic functions. For example, a meteorologist may decide that data that is twice as old can tolerate twice as much error and thus specify a linear amnesic function. In contrast, an econometrist might opt for an exponential amnesic function. We propose online algorithms for our representation and discuss their properties. Finally, we perform an extensive empirical evaluation on 40 data sets and show that our approach can efficiently maintain a high-quality amnesic approximation. © 2008 IEEE.
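
To make the idea of a user-defined amnesic function concrete, the following is a minimal illustrative sketch, not the representation or the online algorithms proposed in the paper. It assumes that an amnesic function maps the age of a data point to the absolute error tolerated at that age, and that the newest point has age 1. The names `linear_amnesic`, `exponential_amnesic`, and `satisfies_amnesic_bound`, along with the `slope` and `base` parameters, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def linear_amnesic(age: float, slope: float = 1.0) -> float:
    """Tolerated error grows linearly with age (assumed form).

    Data that is twice as old tolerates twice as much error.
    """
    return slope * age


def exponential_amnesic(age: float, base: float = 2.0) -> float:
    """Tolerated error grows exponentially with age (assumed form)."""
    return base ** age


def satisfies_amnesic_bound(
    original: Sequence[float],
    approximation: Sequence[float],
    amnesic: Callable[[float], float],
) -> bool:
    """Check that every point's error stays within the tolerance for its age.

    Assumption: the last element is the newest point and has age 1;
    the first element is the oldest and has age len(original).
    """
    if len(original) != len(approximation):
        raise ValueError("series must have the same length")
    n = len(original)
    for i, (x, y) in enumerate(zip(original, approximation)):
        age = n - i  # oldest point gets the largest age
        if abs(x - y) > amnesic(age):
            return False
    return True


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0]
    coarse_past = [3.0, 3.0, 3.0, 4.0]    # error concentrated on old points
    coarse_recent = [1.0, 2.0, 3.0, 6.0]  # error on the newest point
    print(satisfies_amnesic_bound(series, coarse_past, linear_amnesic))
    print(satisfies_amnesic_bound(series, coarse_recent, linear_amnesic))
```

Under these assumptions, an approximation that is coarse on old points can pass the check while one with the same error on the newest point fails, which is the behavior an amnesic representation is meant to permit.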