NIPS 2014


Workshop

Out of the Box: Robustness in High Dimension

Aurelie Lozano · Aleksandr Y Aravkin · Stephen Becker

Level 5; room 510 b

The technical term “robust” was coined in 1953 by G. E. P. Box and exemplifies his adage, “all models are wrong, but some are useful”. Over the past decade, a broad range of new paradigms have appeared that allow useful inference when standard modeling assumptions are violated. Classic examples include heavy-tailed formulations that mitigate the effect of outliers, which would otherwise degrade the performance of Gaussian-based methods.

High-dimensional data are becoming ubiquitous in diverse domains such as genomics, neuroimaging, economics, and finance. Such data make robustness all the more relevant, since errors and model misspecification are prevalent in these modern applications. Extracting pertinent information from large-scale data with robust formulations requires a comprehensive understanding of machine learning, optimization, and statistical signal processing, integrating recovery guarantees, statistical and computational efficiency, algorithm design, and scaling issues. For example, robust Principal Component Analysis (RPCA) can be approached using both convex and non-convex formulations, giving rise to tradeoffs between computational efficiency and theoretical guarantees.

The goal of this workshop is to bring together machine learning, high-dimensional statistics, optimization, and select large-scale applications, in order to investigate the interplay between robust modeling and computation in the large-scale setting. We highlight several important examples that are strongly linked by this theme:

(a) Low-rank matrix recovery, robust PCA, and robust dictionary learning: High-dimensional problems in which the number of variables may greatly exceed the number of observations can be solved accurately by imposing low-dimensional structural constraints on the parameters to be estimated. For matrix-structured parameters, low-rank recovery is a prime example of such a low-dimensional assumption. To efficiently recover the low-rank structure characterizing the data, Robust PCA extends classical PCA to accommodate the grossly corrupted observations that have become ubiquitous in modern applications. Sparse coding and dictionary learning build upon the fact that many real-world signals can be represented as sparse linear combinations of basis vectors from an over-complete dictionary, and aim at learning such an efficient representation of the data. Sparse coding and dictionary learning are being used in a variety of tasks including image denoising and inpainting, texture synthesis, image classification, and unusual event detection.
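As a concrete illustration of (a), the sketch below decomposes an observed matrix into a low-rank component plus a sparse component of gross corruptions using the standard convex principal component pursuit formulation (nuclear norm plus l1 norm), solved with a basic augmented Lagrangian / ADMM loop. The helper names, regularization weight, step parameter, and toy data are our own illustrative choices, not a method prescribed by the workshop.

import numpy as np

def soft_threshold(X, tau):
    # Entrywise soft-thresholding: proximal operator of the l1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_pcp(M, lam=None, mu=None, n_iter=200):
    # Decompose M into low-rank L plus sparse S via principal component pursuit,
    # using a simple inexact augmented Lagrangian (ADMM-style) scheme.
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))       # common default weight
    if mu is None:
        mu = 0.25 * n1 * n2 / np.abs(M).sum()  # heuristic step parameter
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                       # dual variable for M = L + S
    for _ in range(n_iter):
        # L-update: singular-value thresholding of M - S + Y/mu
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # S-update: entrywise shrinkage
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # dual ascent on the equality constraint
        Y = Y + mu * (M - L - S)
    return L, S

# Toy example: a rank-5 matrix plus 5% gross corruptions.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 100))
S0 = np.zeros((100, 100))
mask = rng.random((100, 100)) < 0.05
S0[mask] = 10 * rng.standard_normal(mask.sum())
L, S = rpca_pcp(L0 + S0)
print(np.linalg.norm(L - L0) / np.linalg.norm(L0))  # relative recovery error

The convex relaxation enjoys recovery guarantees but needs a full SVD at every iteration; non-convex, factorized alternatives trade such guarantees for speed, which is one face of the efficiency-versus-guarantees tradeoff mentioned above.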


(b) Robust inference for large-scale inverse problems and machine learning: Much of the data commonly encountered is heavy-tailed, so the Gaussian assumption does not apply. The issue of robustness has been largely overlooked in the high-dimensional learning literature, yet it is critical when dealing with high-dimensional noisy data. Traditional likelihood-based estimators (including the Lasso and Group Lasso) are known to lack resilience to outliers and model misspecification, yet robust learning methods have received comparatively little attention in high-dimensional modeling.
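To make (b) concrete, the following minimal sketch contrasts an ordinary least-squares fit (the Gaussian maximum-likelihood estimator) with a Huber-loss M-estimator on regression data whose responses contain a handful of gross outliers. The loss, ridge penalty, and simulated data are illustrative assumptions of ours, not an estimator advocated by the organizers.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + 0.1 * rng.standard_normal(n)
# Corrupt 10% of the responses with gross outliers.
idx = rng.choice(n, size=20, replace=False)
y[idx] += 10 * rng.standard_normal(20)

def huber(r, delta=1.0):
    # Quadratic near zero, linear in the tails: bounded influence of large residuals.
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def objective(beta, lam=0.1):
    return huber(y - X @ beta).sum() + lam * np.sum(beta**2)

beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]                     # Gaussian fit
beta_rob = minimize(objective, np.zeros(p), method="L-BFGS-B").x   # Huber M-estimator
print(np.linalg.norm(beta_ls - beta_true), np.linalg.norm(beta_rob - beta_true))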

(c) Non-convex formulations: heavy tails, factorized matrix inversion, nonlinear forward models. Combining robustness with statistical efficiency requires non-convexity of the loss function. Surprisingly, it is often possible to show either that certain non-convex problems have exact convex relaxations, or that algorithms directly solving non-convex problems produce points that are statistically indistinguishable from the global optimum.
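One standard illustration of why heavy tails lead to non-convex losses (added here for concreteness, not singled out in the abstract): modeling residuals with a Student-t distribution with \nu degrees of freedom yields, up to additive constants, the loss

\[
  \rho(r) \;=\; \frac{\nu + 1}{2}\,\log\!\Bigl(1 + \frac{r^{2}}{\nu}\Bigr),
\]

which is non-convex in r and whose influence function \(\psi(r) = \rho'(r) = (\nu+1)\,r / (\nu + r^{2})\) redescends to zero, so arbitrarily large outliers exert vanishing influence on the fit.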

(d) Robust optimization: avoiding overfitting on precise but unreliable parameters. This classic topic has become increasingly relevant as researchers purposefully perturb problems. The perturbation comes in many forms: “sketching” functions with Johnson-Lindenstrauss-like transformations, randomized algorithms that speed up linear algebra, randomized coordinate descent, and stochastic gradient algorithms. Recently, the techniques of robust optimization have been applied to these situations.
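As one illustration of the purposeful perturbation mentioned in (d), the sketch below compresses a least-squares problem with a Gaussian Johnson-Lindenstrauss-style sketch and solves the smaller problem (“sketch-and-solve”). The sketch dimension and data sizes are arbitrary choices made for illustration.

import numpy as np

rng = np.random.default_rng(1)
n, p, m = 20000, 50, 400                      # m << n rows after sketching
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.01 * rng.standard_normal(n)

# Gaussian JL-style sketch: i.i.d. N(0, 1/m) entries approximately preserve geometry.
S = rng.standard_normal((m, n)) / np.sqrt(m)
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
beta_sketch = np.linalg.lstsq(S @ X, S @ y, rcond=None)[0]
print(np.linalg.norm(beta_full - beta_sketch) / np.linalg.norm(beta_full))

The sketched problem is cheaper but its solution is only approximate, which is exactly the kind of precise-yet-unreliable input that robust optimization is designed to guard against.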


It is the aim of this workshop to bring together researchers from statistics, machine learning, optimization, and applications, in order to build a comprehensive understanding of robust modeling and computation. In particular, we will examine the challenges of implementing robust formulations in the large-scale and non-convex setting, as well as examples of success in these areas.

The workshop follows in the footsteps of the “Robust ML” workshop at NIPS 2010. The field is very active and there have been significant advances in the past four years. We also expect to cover new topics, such as new applications of robust optimization to user-perturbed problems and to Markov Decision Processes.
