Publication
SODA 2010
Conference paper
1-Pass relative-error Lp-sampling with applications
Abstract
For any $p \in [0, 2]$, we give a 1-pass $\mathrm{poly}(\varepsilon^{-1} \log n)$-space algorithm which, given a data stream of length $m$ with insertions and deletions of an $n$-dimensional vector $a$, with updates in the range $\{-M, -M+1, \ldots, M-1, M\}$, outputs a sample of $[n] = \{1, 2, \ldots, n\}$ for which for all $i$ the probability that $i$ is returned is $(1 \pm \varepsilon)\,|a_i|^p / F_p(a) \pm n^{-C}$, where $a_i$ denotes the (possibly negative) value of coordinate $i$, $F_p(a) = \sum_{i=1}^{n} |a_i|^p = \|a\|_p^p$ denotes the $p$-th frequency moment (i.e., the $p$-th power of the $L_p$ norm), and $C > 0$ is an arbitrarily large constant. Here we assume that $n$, $m$, and $M$ are polynomially related. Our generic sampling framework improves and unifies algorithms for several communication and streaming problems, including cascaded norms, heavy hitters, and moment estimation. It also gives the first relative-error forward sampling algorithm in a data stream with deletions, answering an open question of Cormode et al. Copyright © by SIAM.
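To make the sampling guarantee concrete, the following minimal sketch computes the exact target distribution $|a_i|^p / F_p(a)$ for a small stream of insertions and deletions. It is an offline reference that stores the whole vector, not the paper's small-space 1-pass algorithm; the function names `apply_updates` and `lp_distribution` and the example stream are hypothetical, chosen only for illustration. For $p = 0$ it assumes the usual convention that $F_0(a)$ counts the nonzero coordinates.

```python
from collections import defaultdict


def apply_updates(updates):
    """Accumulate (index, delta) updates into the final vector a.

    Hypothetical helper: stores every coordinate explicitly, so it uses
    linear space, unlike the streaming algorithm in the abstract.
    """
    a = defaultdict(int)
    for i, delta in updates:
        a[i] += delta
    # Keep only nonzero coordinates; zero coordinates have probability 0.
    return {i: v for i, v in a.items() if v != 0}


def lp_distribution(a, p):
    """Return the ideal L_p sampling distribution |a_i|^p / F_p(a).

    Assumes 0 <= p <= 2. For p = 0 every nonzero coordinate gets weight 1,
    so F_0(a) is the number of nonzero coordinates.
    """
    weights = {i: (1.0 if p == 0 else abs(v) ** p)
               for i, v in a.items() if v != 0}
    f_p = sum(weights.values())  # p-th frequency moment F_p(a)
    if f_p == 0:
        raise ValueError("the vector is all zeros, so F_p(a) = 0")
    return {i: w / f_p for i, w in weights.items()}


# Example stream with deletions: coordinate 1 ends negative, coordinate 2 cancels.
updates = [(1, 3), (2, 5), (1, -4), (3, 2), (2, -5)]
a = apply_updates(updates)
dist = lp_distribution(a, 2)
```

Here the final vector has $a_1 = -1$ and $a_3 = 2$, so for $p = 2$ the ideal probabilities are $1/5$ and $4/5$. A sampler meeting the guarantee in the abstract would return each $i$ with probability within a $(1 \pm \varepsilon)$ factor of these values, up to the additive $n^{-C}$ term.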