Timezone: »
Spotlight
Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection
Yihe Dong · Samuel Hopkins · Jerry Li
We study two problems in high-dimensional robust statistics: \emph{robust mean estimation} and \emph{outlier detection}.
In robust mean estimation the goal is to estimate the mean $\mu$ of a distribution on $\mathbb{R}^d$ given $n$ independent samples, an $\epsilon$-fraction of which have been corrupted by a malicious adversary.
In outlier detection the goal is to assign an \emph{outlier score} to each element of a data set such that elements more likely to be outliers are assigned higher scores.
Our algorithms for both problems are based on a new outlier scoring method we call QUE-scoring based on \emph{quantum entropy regularization}.
For robust mean estimation, this yields the first algorithm with optimal error rates and nearly-linear running time $\tilde{O}(nd)$ in all parameters, improving on the previous fastest running time $\tilde{O}(\min(nd/\e^6, nd^2))$.
For outlier detection, we evaluate the performance of QUE-scoring via extensive experiments on synthetic and real data, and demonstrate that it often performs better than previously proposed algorithms.
Author Information
Yihe Dong (Microsoft)
Machine learning researcher and engineer with experience in NLP, statistical ML, and information retrieval.
Sam Hopkins (UC Berkeley)
Jerry Li (Microsoft)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection »
Tue Dec 10th 06:45 -- 08:45 PM Room East Exhibition Hall B + C
More from the Same Authors
-
2020 Poster: COPT: Coordinated Optimal Transport on Graphs »
Yihe Dong · Will Sawin -
2020 Poster: Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding Walks »
Jingqiu Ding · Samuel Hopkins · David Steurer -
2020 Spotlight: Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding Walks »
Jingqiu Ding · Samuel Hopkins · David Steurer -
2020 Poster: Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time »
Jerry Li · Guanghao Ye -
2020 Poster: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Spotlight: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Poster: CoinPress: Practical Private Mean and Covariance Estimation »
Sourav Biswas · Yihe Dong · Gautam Kamath · Jonathan Ullman -
2020 Poster: Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret Minimization »
Sam Hopkins · Jerry Li · Fred Zhang -
2020 Poster: Learning Structured Distributions From Untrusted Batches: Faster and Simpler »
Sitan Chen · Jerry Li · Ankur Moitra -
2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang