Workshop
Fri Dec 08 08:00 AM -- 06:30 PM (PST) @ Hyatt Hotel, Regency Ballroom D+E+F+H
Learning on Distributions, Functions, Graphs and Groups
Florence d'Alché-Buc · Krikamol Muandet · Bharath Sriperumbudur · Zoltán Szabó
The increased variability of acquired data has recently pushed the field of machine learning to extend its scope to non-standard data, including, for example, functional (Ferraty & Vieu, 2006; Wang et al., 2015), distributional (Póczos et al., 2013), graph, or topological data (Carlsson, 2009; Kurlin). Successful applications span a wide range of disciplines, such as healthcare (Zhou et al., 2013), action recognition from iPod/iPhone accelerometer data (Sun et al., 2013), causal inference (Lopez-Paz et al., 2015), bioinformatics (Kondor & Pan, 2016; Kusano et al., 2016), cosmology (Ravanbakhsh et al., 2016; Law et al., 2017), acoustic-to-articulatory speech inversion (Kadri et al., 2016), network inference (Brouard et al., 2016), climate research (Szabó et al., 2016), and ecological inference (Flaxman et al., 2015).
Leveraging the underlying structure of these non-standard data types often leads to a significant boost in prediction accuracy and inference performance. To achieve these improvements, however, numerous challenges and questions have to be addressed: (i) choosing an adequate representation of the data, (ii) constructing appropriate similarity measures (inner product, norm, or metric) on these representations, (iii) efficiently exploiting their intrinsic structure, such as multi-scale nature or invariances, (iv) designing affordable computational schemes (relying, e.g., on surrogate losses), (v) understanding the computational-statistical tradeoffs of the resulting algorithms, and (vi) exploring novel application domains.
The goals of this workshop are
(i) to discuss new theoretical considerations and applications related to learning with non-standard data,
(ii) to explore future research directions by bringing together practitioners with various domain expertise and algorithmic tools, and theoreticians interested in providing sound methodology,
(iii) to accelerate progress in this recent area and broaden its range of applications.
We encourage submissions on a variety of topics, including but not limited to:
- Novel applications for learning on non-standard objects
- Learning theory/algorithms on distributions
- Topological and geometric data analysis
- Functional data analysis
- Multi-task learning, structured output prediction, and surrogate losses
- Vector-valued learning (e.g., operator-valued kernels)
- Gaussian processes
- Learning on graphs and networks
- Group theoretic methods and invariances in learning
- Learning with non-standard input/output data
- Large-scale approximations (e.g., sketching, random Fourier features, hashing, Nyström method, inducing point methods) and statistical-computational efficiency tradeoffs
References:
Frédéric Ferraty and Philippe Vieu. Nonparametric Functional Data Analysis: Theory and Practice. Springer Series in Statistics, Springer-Verlag, 2006.
Jane-Ling Wang, Jeng-Min Chiou, and Hans-Georg Müller. Review of Functional Data Analysis. Annual Review of Statistics and Its Application, 3:1-41, 2015.
Barnabás Póczos, Aarti Singh, Alessandro Rinaldo, and Larry Wasserman. Distribution-free Distribution Regression. International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR 31:507-515, 2013.
Gunnar Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46(2):255-308, 2009.
Vitaliy Kurlin. Research blog: http://kurlin.org/blog/.
Jiayu Zhou, Jun Liu, Vaibhav A. Narayan, and Jieping Ye. Modeling disease progression via multi-task learning. NeuroImage, 78:233-248, 2013.
Xu Sun, Hisashi Kashima, and Naonori Ueda. Large-scale personalized human activity recognition using online multitask learning. IEEE Transactions on Knowledge and Data Engineering, 25:2551-2563, 2013.
David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, and Ilya Tolstikhin. Towards a Learning Theory of Cause-Effect Inference. International Conference on Machine Learning (ICML), PMLR 37:1452-1461, 2015.
Risi Kondor and Horace Pan. The Multiscale Laplacian Graph Kernel. Advances in Neural Information Processing Systems (NIPS), 2982-2990, 2016.
Genki Kusano, Yasuaki Hiraoka, and Kenji Fukumizu. Persistence weighted Gaussian kernel for topological data analysis. International Conference on Machine Learning (ICML), PMLR 48:2004-2013, 2016.
Siamak Ravanbakhsh, Junier Oliva, Sebastian Fromenteau, Layne Price, Shirley Ho, Jeff Schneider, and Barnabás Póczos. Estimating Cosmological Parameters from the Dark Matter Distribution. International Conference on Machine Learning (ICML), PMLR 48:2407-2416, 2016.
Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, and Seth Flaxman. Bayesian Distribution Regression. Technical Report, 2017 (https://arxiv.org/abs/1705.04293).
Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, and Julien Audiffren. Operator-valued kernels for learning from functional response data. Journal of Machine Learning Research, 17:1-54, 2016.
Céline Brouard, Marie Szafranski, and Florence d'Alché-Buc. Input output kernel regression: Supervised and semi-supervised structured output prediction with operator-valued kernels. Journal of Machine Learning Research, 17:1-48, 2016.
Zoltán Szabó, Bharath K. Sriperumbudur, Barnabás Póczos, and Arthur Gretton. Learning Theory for Distribution Regression. Journal of Machine Learning Research, 17(152):1-40, 2016.
Seth Flaxman, Yu-Xiang Wang, and Alex Smola. Who supported Obama in 2012? Ecological inference through distribution regression. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 289-298, 2015.