Optimal principal component analysis in distributed and streaming models

Christos Boutsidis; David P. Woodruff; Peilin Zhong

doi:10.1145/2897518.2897646

STOC 2016

Conference paper

19 Jun 2016

Optimal principal component analysis in distributed and streaming models

View publication

Abstract

This paper studies the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix A ∈ ℝm×n, a rank parameter k < rank(A), and an accuracy parameter 0 < ϵ < 1, we want to output an m × k orthonormal matrix U for which (Equation Presented) where Aκ ∈ ℝm×n is the best rank-k approximation to A. Our contributions are summarized as follows: 1. In the arbitrary partition distributed model of Kannan et al. (COLT 2014), each of s machines holds a matrix A1 and A = Σsi=1 Ai. Each machine should output U. Kannan et al. achieve O(skm/ϵ) + poly(sk/ϵ) words (of O(log(nm)) bits) communication. We obtain the improved bound of O (skm)+ poly(sk/ϵ) words, and show an optimal (up to low order terms) Ω(skm) lower bound. This resolves an open question in the literature. A poly(ϵ-1) dependence is known to be required, but we separate this dependence from m. 2. In a more specific distributed model where each server receives a subset of columns of A, we bypass the above lower bound when A is φ-sparse in each column. Here we obtain an O(skφ/ϵ) + poly(sk/ϵ) word protocol. Our communication is independent of the matrix dimensions, and achieves the guarantee that each server, in addition to outputting U, outputs a subset of O(k/ϵ) columns of A containing a U in its span (that is, for the first time, we solve distributed column subset selection). Additionally, we show a matching Ω(skφ/ϵ) lower bound for distributed column subset selection. Achieving our communication bound when A is sparse in general but not sparse in each column, is impossible. 3. In the streaming model of computation, in which the columns of the matrix A arrive one at a time, an algorithm of Liberty (KDD, 2013) with an improved analysis by Ghashami and Phillips (SODA, 2014) achieves O(km/ϵ) "real numbers" space complexity. We improve this result, since our one-pass streaming PCA algorithm achieves an O(km/ϵ) + poly(k/ϵ) word space upper bound. This almost matches a known Ω(km/ϵ) bit lower bound of Woodruff (NIPS, 2014). We show that with two passes over the columns of A one can achieve an O(km) + poly(k/ϵ) word space upper bound; another lower bound of Woodruff (NIPS, 2014) shows that this is optimal for any constant number of passes (up to the poly (k/ϵ) term and the distinction between words versus bits). 4. Finally, in turnstile streams, in which we receive entries of A one at a time in an arbitrary order, we describe an algorithm with O((m + n)kϵ-1) words of space. This improves the O((m + n)ϵ-2)kϵ-2) upper bound of Clarkson and Woodruff (STOC 2009), and matches their Ω((m + n)kϵ-1) word lower bound. Notably, our results do not depend on the condition number or any singular value gaps of A.

Conference paper