Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably
Abstract
A rank-r matrix X ∈ Rm×ncan be written as a product UV⊤, where U ∈ Rm×rand V ∈ Rn×r. One could exploit this observation in optimization: e.g., consider the minimization of a convex function f(X) over rank-r matrices, where the set of low-rank matrices is modeled via UV⊤. Though such parameterization reduces the number of variables and is more computationally efficient (of particular interest is the case r ≪ min{m, n}), it comes at a cost: f(UV⊤) becomes a nonconvex function w.r.t. U and V. We study such parameterization on generic convex objectives f and focus on first-order, gradient descent algorithms. We propose the bifactored gradient descent (BFGD) algorithm, an efficient first-order method that operates directly on the U, V factors. We show that when f is (restricted) smooth, BFGD has local sublinear convergence; when f is both (restricted) smooth and (restricted) strongly convex, it has local linear convergence. For several applications, we provide simple and efficient initialization schemes that provide initial conditions, good enough for the above convergence results to hold, globally. Extensive experimental results support our arguments that BFGD is an efficient and accurate nonconvex method, compared to state-of-the-art approaches.