# Analysis of stochastic Lanczos quadrature for spectrum approximation

## Abstract

The cumulative empirical spectral measure (CESM) ( \Phi(\vec{A}) : \mathbb{R} \to [0,1] ) of a ( n\times n ) symmetric matrix ( \vec{A} ) is defined as the fraction of eigenvalues of ( \vec{A} ) less than a given threshold, i.e., ( \Phi(\vec{A})(x) := \sum{i=1}^{n} \frac{1}{n} 1[ \lambdai (\vec{A})\leq x] ). Spectral sums ( \tr(f(\vec{A})) ) can be computed as the Riemann--Stieltjes integral of ( f ) against ( \Phi(\vec{A}) ), so the problem of estimating CESM arises frequently in a number of applications, including machine learning. We present an error analysis for stochastic Lanczos quadrature (SLQ). We show that SLQ obtains an approximation to the CESM within a Wasserstein distance of ( t \: | \lambda{\textup{max}}(\vec{A}) - \lambda{\textup{min}}(\vec{A}) | ) with probability at least ( 1-\eta ), by applying the Lanczos algorithm for ( \lceil 6 t^{-1} \rceil ) iterations to ( \lceil 4 ( n+2 )^{-1}t^{-2} \ln(2n\eta^{-1}) \rceil ) vectors sampled independently and uniformly from the unit sphere. We additionally provide (matrix-dependent) a posteriori error bounds for the Wasserstein and Kolmogorov--Smirnov distances between the output of this algorithm and the true CESM. The quality of our bounds is demonstrated using numerical experiments.