Oral
Fast and Accurate Least-Mean-Squares Solvers
Ibrahim Jubran · Alaa Maalouf · Dan Feldman
Least-mean-squares (LMS) solvers such as Linear, Ridge, Lasso, and Elastic-Net regression, as well as SVD, not only solve fundamental machine learning problems, but are also the building blocks of a variety of other methods, such as decision trees and matrix factorizations.
We suggest an algorithm that gets a finite set of $n$ $d$-dimensional real vectors and returns a weighted subset of $d+1$ vectors whose weighted sum is \emph{exactly} the same. The proof of Caratheodory's Theorem (1907) computes such a subset in $O(n^2d^2)$ time and is thus not used in practice. Our algorithm computes this subset in $O(nd)$ time, using $O(\log n)$ calls to Caratheodory's construction on small but "smart" subsets. This is based on a novel paradigm of fusion between different data summarization techniques, known as sketches and coresets.
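A minimal sketch of the two constructions, in Python with NumPy (the function names, the cluster count k, and the uniform contiguous partition are illustrative assumptions, not the paper's exact algorithm):

    import numpy as np

    def caratheodory(P, u):
        # Classic Caratheodory construction: given n weighted points P (n x d)
        # with positive weights u, return indices of at most d+1 of them and new
        # positive weights with the same weighted sum. O(n^2 d^2) time overall.
        P, u = np.asarray(P, float), np.asarray(u, float).copy()
        n, d = P.shape
        idx = np.arange(n)
        while len(idx) > d + 1:
            Q = P[idx]
            # Nonzero v with sum(v) = 0 and sum_i v_i Q_i = 0: a null-space
            # vector of the d x (m-1) matrix whose columns are Q_i - Q_0.
            v_rest = np.linalg.svd((Q[1:] - Q[0]).T)[2][-1]
            v = np.concatenate(([-v_rest.sum()], v_rest))
            if v.max() <= 0:
                v = -v
            # Largest step that keeps all weights nonnegative; it drives at
            # least one weight to zero, so that point can be removed.
            pos = v > 0
            ratios = np.full(len(u), np.inf)
            ratios[pos] = u[pos] / v[pos]
            j = int(np.argmin(ratios))
            u = u - ratios[j] * v
            u[j] = 0.0
            keep = u > 0
            idx, u = idx[keep], u[keep]
        return idx, u

    def fast_caratheodory(P, u, k=None):
        # The recursive speed-up described in the abstract: partition the points
        # into k clusters, call Caratheodory only on the k weighted cluster
        # means, keep the points of the (at most d+1) surviving clusters with
        # rescaled weights, and repeat; only O(log n) calls on small subsets.
        P, u = np.asarray(P, float), np.asarray(u, float).copy()
        n, d = P.shape
        k = k or 2 * (d + 2)   # assumption: enough clusters to roughly halve n per round
        idx = np.arange(n)
        while len(idx) > d + 1:
            parts = np.array_split(np.arange(len(idx)), min(k, len(idx)))
            cw = np.array([u[p].sum() for p in parts])
            means = np.array([u[p] @ P[idx[p]] / u[p].sum() for p in parts])
            sel, w = caratheodory(means, cw)   # small input: at most k points
            new_idx = [idx[parts[j]] for j in sel]
            new_u = [u[parts[j]] * (wj / u[parts[j]].sum()) for j, wj in zip(sel, w)]
            idx, u = np.concatenate(new_idx), np.concatenate(new_u)
        return idx, u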
As an example application, we show how it can be used to boost the performance of existing LMS solvers, such as those in the scikit-learn library, by up to x100. Generalization to streaming and distributed (big) data is trivial.
Extensive experimental results and complete open source code are also provided.
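For illustration only, here is a hedged sketch (not the authors' released code) of the boosting idea: lift each regression row to the flattened outer product of [x_i, y_i, 1], reduce the lifted points with the fast_caratheodory routine sketched above, and fit scikit-learn's LinearRegression on the small weighted subset. Since all first and second moments are preserved exactly, the coefficients should match the full-data fit up to floating-point error. The data sizes and variable names are arbitrary choices for the example.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n, d = 50_000, 5
    X = rng.standard_normal((n, d))
    y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

    # Lift each row to the flattened outer product of a_i = [x_i, y_i, 1]; the
    # normal equations of (weighted) least squares depend only on the sum of
    # these lifted points, so preserving that sum preserves the solution.
    A = np.hstack([X, y[:, None], np.ones((n, 1))])        # n x (d+2)
    lifted = np.einsum('ni,nj->nij', A, A).reshape(n, -1)  # n x (d+2)^2

    # fast_caratheodory is the sketch above; it returns at most (d+2)^2 + 1
    # row indices and positive weights whose weighted sum equals the full sum.
    idx, w = fast_caratheodory(lifted, np.ones(n))

    full = LinearRegression().fit(X, y)
    small = LinearRegression().fit(X[idx], y[idx], sample_weight=w)
    # The two fits should agree up to floating-point error, although the second
    # one sees only ~(d+2)^2 weighted rows instead of n.
    print(np.max(np.abs(full.coef_ - small.coef_)),
          abs(full.intercept_ - small.intercept_))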
Author Information
Ibrahim Jubran (University of Haifa)
Alaa Maalouf (University of Haifa)
Dan Feldman (University of Haifa)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Fast and Accurate Least-Mean-Squares Solvers
  Wed Dec 11th 06:45 -- 08:45 PM, Room: East Exhibition Hall B + C
More from the Same Authors
- 2020 Poster: Coresets for Near-Convex Functions
  Murad Tukan · Alaa Maalouf · Dan Feldman
- 2019 Poster: k-Means Clustering of Lines for Big Data
  Yair Marom · Dan Feldman
- 2016 Poster: Dimensionality Reduction of Massive Sparse Datasets Using Coresets
  Dan Feldman · Mikhail Volkov · Daniela Rus