Poster
Coresets for Near-Convex Functions
Murad Tukan · Alaa Maalouf · Dan Feldman
A coreset is usually a small weighted subset of $n$ input points in $\mathbb{R}^d$ that provably approximates their loss function for a given set of queries (models, classifiers, etc.). Coresets are becoming increasingly common in machine learning, since existing heuristics or inefficient algorithms can be improved by running them, possibly many times, on the small coreset, which can be maintained for streaming distributed data. Coresets can be obtained by sensitivity (importance) sampling, where the coreset size is proportional to the total sum of sensitivities. Unfortunately, computing the sensitivity of each point is problem-dependent and may be harder than solving the original optimization problem at hand. We suggest a generic framework for computing sensitivities (and thus coresets) for a wide family of loss functions, which we call near-convex functions. This is done by suggesting the $f$-SVD factorization, which generalizes the SVD factorization of matrices to functions. Example applications include coresets that are either new or significantly improve on previous results, such as SVM, logistic regression, M-estimators, and $\ell_z$-regression. Experimental results and open-source code are also provided.
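The sampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the per-point sensitivity upper bounds are already given; computing them is the hard, problem-dependent part that the paper's framework addresses, and this sketch does not attempt it. The function name `sensitivity_sample` and its parameters are hypothetical and are not taken from the authors' code.

```python
import random


def sensitivity_sample(points, sensitivities, m, seed=0):
    """Draw a weighted sample of size m by importance sampling.

    Assumption: `sensitivities` holds positive upper bounds on each
    point's sensitivity, supplied by the caller.
    """
    if len(points) != len(sensitivities):
        raise ValueError("need exactly one sensitivity per point")
    if m <= 0 or any(s <= 0 for s in sensitivities):
        raise ValueError("m and all sensitivities must be positive")

    total = sum(sensitivities)
    # Sampling probability of each point is proportional to its sensitivity.
    probs = [s / total for s in sensitivities]

    rng = random.Random(seed)
    # Sample m indices i.i.d. (with replacement) according to probs.
    indices = rng.choices(range(len(points)), weights=probs, k=m)

    # Reweight each sampled point by 1 / (m * p_i) so that the weighted
    # loss of the sample is an unbiased estimate of the full loss.
    sample = [points[i] for i in indices]
    weights = [1.0 / (m * probs[i]) for i in indices]
    return sample, weights


if __name__ == "__main__":
    pts = [(float(i), float(i % 3)) for i in range(1000)]
    # Placeholder bounds for illustration only: uniform sensitivities.
    bounds = [1.0] * len(pts)
    subset, w = sensitivity_sample(pts, bounds, m=50)
    print(len(subset), sum(w))
```

With uniform bounds this reduces to uniform sampling; the benefit of the approach comes from tighter, non-uniform bounds, since the sample size needed for a given approximation error grows with the total sum of the bounds.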
Author Information
Murad Tukan (University of Haifa)
Alaa Maalouf (The University of Haifa)
Dan Feldman (University of Haifa)
More from the Same Authors

2021 Spotlight: Coresets for Decision Trees of Signals
Ibrahim Jubran · Ernesto Evgeniy Sanches Shayda · Ilan I Newman · Dan Feldman 
2021 Poster: Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
Lucas Liebenwein · Alaa Maalouf · Dan Feldman · Daniela Rus 
2021 Poster: Coresets for Decision Trees of Signals
Ibrahim Jubran · Ernesto Evgeniy Sanches Shayda · Ilan I Newman · Dan Feldman 
2019 Poster: Fast and Accurate Least-Mean-Squares Solvers
Ibrahim Jubran · Alaa Maalouf · Dan Feldman 
2019 Oral: Fast and Accurate Least-Mean-Squares Solvers
Ibrahim Jubran · Alaa Maalouf · Dan Feldman 
2019 Poster: k-Means Clustering of Lines for Big Data
Yair Marom · Dan Feldman 
2016 Poster: Dimensionality Reduction of Massive Sparse Datasets Using Coresets
Dan Feldman · Mikhail Volkov · Daniela Rus