Framework and algorithms for trend analysis in massive temporal data sets

Sreenivas Gollapudi; D. Sivakumar

doi:10.1145/1031171.1031208

CIKM 2004

Conference paper

08 Nov 2004

Abstract

Mining massive temporal data streams for significant trends, emerging buzz, and unusually high or low activity is an important problem with several commercial applications. In this paper, we propose a framework based on relational records and metric spaces to study such problems. Our framework provides the necessary mathematical underpinnings for this genre of problems, and leads to efficient algorithms in the stream/sort model of massive data sets (where the algorithm makes passes over the data, computes a new stream on the fly, and is allowed to sort the intermediate data). Our algorithm makes novel use of metric approximations in the data stream context, and highlights the role of hierarchical organization of large data sets in designing efficient algorithms in the stream/sort model. © 2004 ACM.
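As a rough illustration of the stream/sort model mentioned in the abstract, the sketch below chains a streaming pass, a sort of the intermediate stream, and a second streaming pass. It is not the paper's algorithm: the record layout `(timestamp, term)`, the function names `map_pass`, `count_pass`, and `term_counts`, and the `bucket` parameter are all assumptions made for this example, and the in-memory `sorted` call merely stands in for the external sort a massive data set would need.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record layout: (timestamp, term). Not taken from the paper.
Record = Tuple[int, str]


def map_pass(records: Iterable[Record], bucket: int) -> Iterator[Tuple[str, int]]:
    """Streaming pass: emit one (term, time_bucket) pair per input record."""
    for timestamp, term in records:
        yield term, timestamp // bucket


def count_pass(sorted_pairs: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int, int]]:
    """Streaming pass over sorted pairs.

    After sorting, equal pairs are adjacent, so a single running count
    per group is enough; no per-term table is kept.
    """
    for (term, time_bucket), group in groupby(sorted_pairs):
        yield term, time_bucket, sum(1 for _ in group)


def term_counts(records: Iterable[Record], bucket: int) -> List[Tuple[str, int, int]]:
    """Pass, sort, pass: count occurrences of each term per time bucket."""
    intermediate = map_pass(records, bucket)
    # Assumption: sorted() replaces the model's sort primitive for this toy input.
    return list(count_pass(sorted(intermediate)))


if __name__ == "__main__":
    sample = [(0, "a"), (5, "b"), (12, "a"), (3, "a"), (14, "b")]
    for row in term_counts(sample, bucket=10):
        print(row)
```

Each pass reads its input once and keeps only a constant amount of state, which is the property the model relies on; the per-bucket counts produced here are the kind of intermediate stream a later pass could scan for unusually high or low activity.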
