Numerical linear algebra in the streaming model
Abstract
We give near-optimal space bounds in the streaming model for linear algebra problems that include estimation of matrix products, linear regression, low-rank approximation, and approximation of matrix rank. In the streaming model, sketches of input matrices are maintained under updates of matrix entries; we prove results for turnstile updates, given in an arbitrary order. We give the first lower bounds known for the space needed by the sketches, for a given estimation error e. We sharpen prior upper bounds, with respect to combinations of space, failure probability, and number of passes. The sketch we use for matrix A is simply STA, where S is a sign matrix. Our results include the following upper and lower bounds on the bits of space needed for 1-pass algorithms. Here A is an n × d matrix, B is an n × d' matrix, and c := d +d'. These results are given for fixed failure probability; for failure probability δ > 0, the upper bounds require a factor of log(1/δ) more space. We assume the inputs have integer entries specified by O(log(nc)) bits, or O(log(nd)) bits. 1. (Matrix Product) Output matrix C with ||A TB - C|| ≤ e||A||||B||. We show that Θ(ce -2 log(nc)) space is needed. 2. (Linear Regression) For d' = 1, so that B is a vectorb, find x so that ||Ax - b|| le; (1 + e) min x'eIRd ||Ax' - b||. We show that Θ(d 2e -1 log(nd)) space is needed. 3. (Rank-k Approximation) Find matrix A ̃k of rank no more than k, so that||A - A ̃k ≤ (1 + e)||A - A k||, where A k is the best rank-k approximation to A. Our lower bound is Ω(ke -1(n + d) log(nd)) space, and we give a one-pass algorithm matching this when A is given row-wise or column-wise. For general updates, we give a one-pass algorithm needing O(ke -2(n + d/e 2) log(nd)) space. We also give upper and lower bounds for algorithms using multiple passes, and a sketching analog of the CUR decomposition. Copyright 2009 ACM.