Input-Sparsity Low Rank Approximation in Schatten Norm

Part of Proceedings of the International Conference on Machine Learning 1 pre-proceedings (ICML 2020)

Bibtex »Metadata »Paper »Supplemental »

Bibtek download is not availble in the pre-proceeding


Authors

Yi Li, David Woodruff

Abstract

We give the first input-sparsity time algorithms for the rank-$k$ low rank approximation problem in every Schatten norm. Specifically, for a given $n\times n$ matrix $A$, our algorithm computes $Y,Z\in \R^{n\times k}$, which, with high probability, satisfy $\|A-YZ^T\|_p \leq (1+\eps)\|A-A_k\|_p$, where $\|M\|_p = \left (\sum_{i=1}^n \sigma_i(M)^p \right )^{1/p}$ is the Schatten $p$-norm of a matrix $M$ with singular values $\sigma_1(M), \ldots, \sigma_n(M)$, and where $A_k$ is the best rank-$k$ approximation to $A$. Our algorithm runs in time $\tilde{O}(\nnz(A) + n^{\alpha_p}\poly(k/\eps))$, where $\alpha_p = 1$ for $p\in [1,2)$ and $\alpha_p = 1 + (\omega-1)(1-2/p)$ for $p>2$ and $\omega \approx 2.374$ is the exponent of matrix multiplication. For the important case of $p = 1$, which corresponds to the more ``robust'' nuclear norm, we obtain $\tilde{O}(\nnz(A) + n \cdot \poly(k/\epsilon))$ time, which was previously only known for the Frobenius norm $(p = 2)$. Moreover, since $\alpha_p < \omega$ for every $p$, our algorithm has a better dependence on $n$ than that in the singular value decomposition for every $p$. Crucial to our analysis is the use of dimensionality reduction for Ky-Fan $p$-norms.