NIPS 2012


Workshop

Modern Nonparametric Methods in Machine Learning

Sivaraman Balakrishnan · Arthur Gretton · Mladen Kolar · John Lafferty · Han Liu · Tong Zhang

Sand Harbor 3, Harrah’s Special Events Center 2nd Floor

The objective of this workshop is to bring together practitioners and theoreticians who are interested in developing scalable and principled nonparametric learning algorithms for analyzing complex, large-scale datasets. The workshop will communicate the newest research results and address several important bottlenecks of nonparametric learning by exploring (i) new models and methods that enable high-dimensional nonparametric learning, (ii) new computational techniques that enable scalable nonparametric learning in an online and parallel fashion, and (iii) new statistical theory that characterizes the performance and information-theoretic limits of nonparametric learning algorithms. The goals of this workshop are (i) to report the state of the art in modern nonparametrics, (ii) to identify major challenges and map out the frontiers of nonparametric methods, and (iii) to connect currently disjoint communities in machine learning and statistics. The targeted application areas include genomics, cognitive neuroscience, climate science, astrophysics, and natural language processing.

Modern data acquisition routinely produces massive and complex datasets, including chip data from high-throughput genomic experiments, image data from functional Magnetic Resonance Imaging (fMRI), proteomic data from tandem mass spectrometry analysis, and climate data from geographically distributed data centers. Existing high-dimensional theories and learning algorithms rely heavily on parametric models, which assume the data come from an underlying distribution (e.g., Gaussian or linear models) that can be characterized by a finite number of parameters. If these assumptions are correct, accurate and precise estimates can be expected. However, given the increasing complexity of modern scientific datasets, conclusions inferred under these restrictive assumptions can be misleading. To address this challenge, this workshop focuses on nonparametric methods, which conduct inference directly in infinite-dimensional spaces and are thus powerful enough to capture the subtleties of most modern applications.

We are targeting submissions on a variety of topics. These include, but are not limited to, the following areas where high-dimensional nonparametric methods have found past success (a short illustrative code sketch for each area follows the list):
1. Nonparametric graphical models are a flexible way to model continuous distributions. For example, copulas can be used to separate the dependency structure between random variables from their marginal distributions (Liu et al., 2009). Fully nonparametric models of networks can be obtained by combining kernel density estimation with a restriction of the graphs to trees and forests (Liu et al., 2011). (See Sketch 1 after this list.)
2. Causal inference using kernel-based conditional independence testing is a nonparametric method that substantially improves over previous approaches to estimating and testing conditional independence (Zhang et al., 2012). (See Sketch 2.)
3. Sparse additive models are used in many applications where linear regression models do not provide enough flexibility (Lin and Zhang, 2006; Koltchinskii and Yuan, 2010; Huang et al., 2010; Ravikumar et al., 2009; Meier et al., 2009). (See Sketch 3.)
4. Nonparametric methods can be used to consistently estimate a large class of divergence measures, which have a wide range of applications (Poczos and Schneider, 2011). (See Sketch 4.)
5. Recently, sparse matrix decompositions (Witten et al., 2009) were proposed as exploratory data analysis tools for high-dimensional genomic data. Motivated by the need for additional modelling flexibility, sparse nonparametric generalizations of these matrix decompositions have been introduced (Balakrishnan et al., 2012). (See Sketch 5.)
6. Nonparametric learning promises flexibility: it minimizes assumptions such as linearity and Gaussianity, which are often made only for convenience or for lack of alternatives. However, nonparametric estimation often comes with increased computational demands, so algorithms that scale to large datasets must take advantage of parallel computation. Promising parallel computing techniques include GPU programming, multi-core computing, and cloud computing. (See Sketch 6.)
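
Sketch 1 (nonparametric graphical models). A minimal, illustrative sketch of rank-based estimation of the latent correlation matrix in a Gaussian copula, in the spirit of the nonparanormal family (Liu et al., 2009). The Spearman bridge 2 sin(pi * rho_s / 6) is a standard rank-based device used in later nonparanormal estimators; the function name and toy data below are our own, not the cited papers' code.

```python
import numpy as np
from scipy.stats import spearmanr

def nonparanormal_corr(X):
    """Rank-based estimate of the latent correlation matrix of a
    Gaussian copula: map Spearman correlations through
    2*sin(pi/6 * rho_s), which is invariant to monotone marginal
    transformations. Expects X with shape (n, d), d >= 3."""
    rho_s, _ = spearmanr(X)               # d x d Spearman correlation matrix
    sigma = 2.0 * np.sin(np.pi / 6.0 * rho_s)
    np.fill_diagonal(sigma, 1.0)          # exact ones on the diagonal
    return sigma

# toy check: monotone marginal distortions barely move the estimate
rng = np.random.default_rng(0)
cov = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.5], [0.2, 0.5, 1.0]]
Z = rng.multivariate_normal(np.zeros(3), cov, size=2000)
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3, Z[:, 2]])
print(np.round(nonparanormal_corr(X), 2))
```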
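
Sketch 2 (kernel independence testing). The full kernel-based conditional independence (KCI) test of Zhang et al. (2012) handles conditioning sets and is more involved; as a building block, here is a hedged sketch of an unconditional kernel independence test: a biased HSIC statistic with a permutation null and a median-heuristic bandwidth. All names and defaults are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_gram(x):
    """RBF Gram matrix for a 1-d sample, median-heuristic bandwidth."""
    d2 = (x[:, None] - x[None, :]) ** 2
    sigma2 = 0.5 * np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))

def hsic(K, L):
    """Biased HSIC statistic computed from two Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

def hsic_perm_test(x, y, n_perm=500, seed=0):
    """Permutation p-value for H0: x is independent of y."""
    rng = np.random.default_rng(seed)
    K, L = rbf_gram(x), rbf_gram(y)
    stat = hsic(K, L)
    null = np.array([hsic(K, L[np.ix_(p, p)])
                     for p in (rng.permutation(len(x)) for _ in range(n_perm))])
    return stat, (1 + np.sum(null >= stat)) / (1 + n_perm)

rng = np.random.default_rng(1)
x = rng.normal(size=200)
print(hsic_perm_test(x, x ** 2 + 0.1 * rng.normal(size=200)))  # dependent pair
```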
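
Sketch 3 (sparse additive models). The SpAM backfitting idea (Ravikumar et al., 2009) smooths each partial residual and then soft-thresholds the entire component function by its empirical norm, zeroing out irrelevant coordinates. The sketch below uses an illustrative Nadaraya-Watson smoother and our own function names; it is a schematic of the idea, not the authors' code.

```python
import numpy as np

def kernel_smooth(x, r, h=0.2):
    """Nadaraya-Watson smoother of responses r at the design points x."""
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (W @ r) / W.sum(axis=1)

def spam_backfit(X, y, lam=0.1, n_iter=20, h=0.2):
    """SpAM-style sparse backfitting: smooth each partial residual,
    then shrink the whole component function by its empirical norm."""
    n, d = X.shape
    F = np.zeros((n, d))                               # fitted components
    for _ in range(n_iter):
        for j in range(d):
            r = y - y.mean() - F.sum(axis=1) + F[:, j]  # partial residual
            p = kernel_smooth(X[:, j], r, h)            # smoothed fit
            norm = np.sqrt(np.mean(p ** 2))
            F[:, j] = max(0.0, 1.0 - lam / (norm + 1e-12)) * p
            F[:, j] -= F[:, j].mean()                   # center component
    return F

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 10))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)
F = spam_backfit(X, y, lam=0.05)
print(np.round(np.sqrt((F ** 2).mean(axis=0)), 2))  # first two norms dominate
```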
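
Sketch 4 (divergence estimation). One concrete instance of nonparametric divergence estimation is the k-nearest-neighbour estimator of the Kullback-Leibler divergence (in the style of Wang, Kulkarni, and Verdu); Poczos and Schneider study kNN estimators for broader divergence families. A minimal sketch, assuming samples X ~ p and Y ~ q in R^d:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(X, Y, k=1):
    """k-NN estimate of KL(p || q) from X ~ p (n x d) and Y ~ q (m x d):
    compare the k-th neighbour distance of each x_i within X to its
    k-th neighbour distance within Y."""
    n, d = X.shape
    m = Y.shape[0]
    rho = cKDTree(X).query(X, k + 1)[0][:, -1]   # k-th neighbour in X \ {x_i}
    nu = cKDTree(Y).query(X, k)[0]
    nu = nu if nu.ndim == 1 else nu[:, -1]       # k-th neighbour in Y
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(2000, 1))
Y = rng.normal(1.0, 1.0, size=(2000, 1))
print(knn_kl_divergence(X, Y))   # true KL is 0.5 for this unit-variance shift
```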
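
Sketch 5 (sparse matrix decompositions). The penalized matrix decomposition of Witten et al. (2009) alternates sparse updates of the left and right factors, tuning thresholds by binary search to meet exact l1 budgets; the simplified sketch below instead fixes soft-thresholding levels directly and warm-starts from the dense leading singular vector, so it should be read as a schematic variant rather than the published algorithm.

```python
import numpy as np

def soft(a, lam):
    """Elementwise soft-thresholding."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_rank1(X, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Rank-one sparse decomposition X ~ d * u v^T via alternating
    soft-thresholded power iterations (simplified PMD-style scheme)."""
    u = np.linalg.svd(X, full_matrices=False)[0][:, 0]  # dense warm start
    for _ in range(n_iter):
        v = soft(X.T @ u, lam_v)
        v /= np.linalg.norm(v) + 1e-12
        u = soft(X @ v, lam_u)
        u /= np.linalg.norm(u) + 1e-12
    return u, v, u @ X @ v

# toy data: a sparse rank-one signal plus noise
rng = np.random.default_rng(1)
u0 = np.zeros(50); u0[:5] = 1.0
v0 = np.zeros(40); v0[:4] = 1.0
X = np.outer(u0, v0) + 0.05 * rng.standard_normal((50, 40))
u, v, d = sparse_rank1(X, lam_u=0.3, lam_v=0.3)
print(np.count_nonzero(u), np.count_nonzero(v))  # small recovered supports
```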
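
Sketch 6 (parallel computation). An embarrassingly parallel evaluation of a kernel density estimate illustrates the multi-core point: query points are split into chunks and smoothed on separate processes. This multiprocessing sketch is our own construction; GPU and cloud versions follow the same data-partitioning pattern.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def kde_chunk(query, data, h):
    """Gaussian KDE evaluated at a chunk of 1-d query points."""
    z = (query[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def parallel_kde(query, data, h=0.1, workers=4):
    """Split the query grid into chunks and evaluate each chunk
    on a separate worker process."""
    chunks = np.array_split(query, workers)
    with ProcessPoolExecutor(workers) as pool:
        parts = pool.map(partial(kde_chunk, data=data, h=h), chunks)
    return np.concatenate(list(parts))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=10_000)
    grid = np.linspace(-4, 4, 2_000)
    print(parallel_kde(grid, data)[:5])
```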

References
[1] Francis R. Bach and Michael I. Jordan. Kernel independent component analysis. JMLR, 3:1–48, March 2003.
[2] Sivaraman Balakrishnan, Kriti Puniyani, and John Lafferty. Sparse additive functional and kernel CCA. ICML, 2012.
[3] Jian Huang, Joel L. Horowitz, and Fengrong Wei. Variable selection in nonparametric additive models. Ann. Statist., 2010.
[4] Vladimir Koltchinskii and Ming Yuan. Sparsity in multiple kernel learning. Ann. Statist., 2010.
[5] John Lafferty, Han Liu, and Larry Wasserman. Sparse nonparametric graphical models. arXiv:1201.0794v1, 2012.
[6] Yi Lin and Hao Helen Zhang. Component selection and smoothing in multivariate nonparametric regression. Ann. Statist., 2006.
[7] Han Liu, John D. Lafferty, and Larry A. Wasserman. The nonparanormal: Semiparametric estimation of high dimensional undirected graphs. JMLR, 2009.
[8] Han Liu, Min Xu, Haijie Gu, Anupam Gupta, John D. Lafferty, and Larry A. Wasserman. Forest density estimation. JMLR, 2011.
[9] Lukas Meier, Sara van de Geer, and Peter Bühlmann. High-dimensional additive modeling. Ann. Statist., 2009.
[10] Barnabás Póczos and Jeff Schneider. Nonparametric estimation of conditional information and divergences. AISTATS, 2012.
[11] Garvesh Raskutti, Martin Wainwright, and Bin Yu. Minimax-optimal rates for sparse additive models over kernel classes via convex programming. JMLR, 2010.
[12] Pradeep Ravikumar, John Lafferty, Han Liu, and Larry Wasserman. Sparse additive models. JRSSB (Statistical Methodology), 2009.
[13] Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 1998.
[14] Daniela M. Witten, Robert Tibshirani, and Trevor Hastie. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 2009.
[15] Kun Zhang, Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Kernel-based conditional independence test and application in causal discovery. CoRR, abs/1202.3775, 2012.

Website: https://sites.google.com/site/nips2012modernnonparametric/
