Poster
Towards a Zero-One Law for Column Subset Selection
Zhao Song · David Woodruff · Peilin Zhong
East Exhibition Hall B, C #47
Keywords: [ Theory ] [ Learning Theory ] [ Algorithms; Algorithms -> Components Analysis (e.g., CCA, ICA, LDA, PCA); Theory ]
[
Abstract
]
Abstract:
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank-k matrix B minimizing the sum of absolute values of differences to a given n-by-n matrix A, minrank-k B‖A−B‖1, or more generally finding a rank-k matrix B which minimizes the sum of p-th powers of absolute values of differences, minrank-k B‖A−B‖pp. Many of these algorithms are linear time columns subset selection algorithms,
returning a subset of \poly(klogn) columns whose cost is no more than a \poly(k) factor larger than the cost of the best rank-k matrix.
The above error measures are special cases of the following general entrywise
low rank approximation problem: given an arbitrary function g:R→R≥0, find a rank-k matrix B which minimizes ‖A−B‖g=∑i,jg(Ai,j−Bi,j). A natural question is which functions g admit efficient approximation algorithms? Indeed, this is a central question of recent work studying generalized low rank models. In this work we give approximation algorithms for {\it every} function g which is approximately monotone and satisfies an approximate triangle inequality, and we show both of these conditions are necessary. Further, our algorithm is efficient if the function g admits an efficient approximate regression algorithm. Our approximation algorithms handle functions which are not even scale-invariant, such as the Huber loss function, which we show have very different structural properties than ℓp-norms, e.g., one can show the lack of scale-invariance causes any column subset selection algorithm to provably require a √logn factor larger number of columns than ℓp-norms; nevertheless we design the first efficient column subset selection algorithms for such error measures.
Live content is unavailable. Log in and register to view live content