Publication
SIGMOD/PODS 2012
Conference paper

Space-efficient estimation of statistics over sub-sampled streams

View publication

Abstract

In many stream monitoring situations, the data arrival rate is so high that it is not even possible to observe each element of the stream. The most common solution is to sample a small fraction of the data stream and use the sample to infer properties and estimate aggregates of the original stream. However, the quantities that need to be computed on the sampled stream are often different from the original quantities of interest and their estimation requires new algorithms. We present upper and lower bounds (often matching) for estimating frequency moments, support size, entropy, and heavy hitters of the original stream from the data observed in the sampled stream. © 2012 ACM.

Date

26 Jun 2012

Publication

SIGMOD/PODS 2012

Authors

Share