The increased variability of acquired data has recently pushed the field of machine learning to extend its scope to non-standard data, including, for example, functional (Ferraty & Vieu, 2006; Wang et al., 2015), distributional (Póczos et al., 2013), graph, or topological data (Carlsson, 2009; Kurlin). Successful applications span a wide range of disciplines such as healthcare (Zhou et al., 2013), action recognition from iPod/iPhone accelerometer data (Sun et al., 2013), causal inference (Lopez-Paz et al., 2015), bioinformatics (Kondor & Pan, 2016; Kusano et al., 2016), cosmology (Ravanbakhsh et al., 2016; Law et al., 2017), acoustic-to-articulatory speech inversion (Kadri et al., 2016), network inference (Brouard et al., 2016), climate research (Szabó et al., 2016), and ecological inference (Flaxman et al., 2015).
Leveraging the underlying structure of these non-standard data types often leads to a significant boost in prediction accuracy and inference performance. To achieve these compelling improvements, however, numerous challenges and questions have to be addressed: (i) choosing an adequate representation of the data, (ii) constructing appropriate similarity measures (inner product, norm, or metric) on these representations, (iii) efficiently exploiting their intrinsic structure, such as multi-scale nature or invariances, (iv) designing affordable computational schemes (relying, e.g., on surrogate losses), (v) understanding the computational-statistical tradeoffs of the resulting algorithms, and (vi) exploring novel application domains.
The goals of this workshop are
(i) to discuss new theoretical considerations and applications related to learning with non-standard data,
(ii) to explore future research directions by bringing together practitioners with various domain expertise and algorithmic tools, and theoreticians interested in providing sound methodology, and
(iii) to accelerate advances in this emerging area and its arsenal of applications.
We encourage submissions on a variety of topics, including but not limited to:
- Novel applications for learning on non-standard objects
- Learning theory/algorithms on distributions
- Topological and geometric data analysis
- Functional data analysis
- Multi-task learning, structured output prediction, and surrogate losses
- Vector-valued learning (e.g., operator-valued kernel)
- Gaussian processes
- Learning on graphs and networks
- Group theoretic methods and invariances in learning
- Learning with non-standard input/output data
- Large-scale approximations (e.g. sketching, random Fourier features, hashing, Nyström method, inducing point methods), and statistical-computational efficiency tradeoffs
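As a concrete illustration of one approximation in the last topic, below is a minimal random Fourier features sketch for the Gaussian kernel, following the classical Rahimi-Recht construction. This is a sketch only; the function name and parameter defaults are illustrative, not taken from any of the works above.

```python
import numpy as np

def rff_features(X, n_features=2000, sigma=1.0, seed=0):
    """Map X (n, d) to random Fourier features whose inner products
    approximate the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))  # spectral samples
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)         # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# The feature inner products approximate the exact kernel matrix:
X = np.random.default_rng(1).normal(size=(5, 3))
Z = rff_features(X)
K_approx = Z @ Z.T
K_exact = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 2.0)
```

With 2000 features the entrywise error is on the order of 1/sqrt(2000), which is the statistical-computational tradeoff the topic above refers to: more features give a better kernel approximation at higher cost.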
References:
Frédéric Ferraty and Philippe Vieu. Nonparametric Functional Data Analysis: Theory and Practice. Springer Series in Statistics, Springer-Verlag, 2006.
Jane-Ling Wang, Jeng-Min Chiou, and Hans-Georg Müller. Review of Functional Data Analysis. Annual Review of Statistics and Its Application, 3:1-41, 2015.
Barnabás Póczos, Aarti Singh, Alessandro Rinaldo, Larry Wasserman. Distribution-free Distribution Regression. International Conference on AI and Statistics (AISTATS), PMLR 31:507-515, 2013.
Gunnar Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46 (2): 255-308, 2009.
Vitaliy Kurlin. Research blog: http://kurlin.org/blog/.
Jiayu Zhou, Jun Liu, Vaibhav A. Narayan, and Jieping Ye. Modeling disease progression via multi-task learning. NeuroImage, 78:233-248, 2013.
Xu Sun, Hisashi Kashima, and Naonori Ueda. Large-scale personalized human activity recognition using online multitask learning. IEEE Transactions on Knowledge and Data Engineering, 25:2551-2563, 2013.
David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, and Ilya Tolstikhin. Towards a Learning Theory of Cause-Effect Inference. International Conference on Machine Learning (ICML), PMLR 37:1452-1461, 2015.
Risi Kondor, Horace Pan. The Multiscale Laplacian Graph Kernel. Advances in Neural Information Processing Systems (NIPS), 2982-2990, 2016.
Genki Kusano, Yasuaki Hiraoka, Kenji Fukumizu. Persistence weighted Gaussian kernel for topological data analysis. International Conference on Machine Learning (ICML), PMLR 48:2004-2013, 2016.
Siamak Ravanbakhsh, Junier Oliva, Sebastian Fromenteau, Layne Price, Shirley Ho, Jeff Schneider, Barnabás Póczos. Estimating Cosmological Parameters from the Dark Matter Distribution. International Conference on Machine Learning (ICML), PMLR 48:2407-2416, 2016.
Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, Seth Flaxman. Bayesian Distribution Regression. Technical Report, 2017 (https://arxiv.org/abs/1705.04293).
Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, and Julien Audiffren. Operator-valued kernels for learning from functional response data. Journal of Machine Learning Research, 17:1-54, 2016.
Céline Brouard, Marie Szafranski, and Florence d’Alché-Buc. Input output kernel regression: Supervised and semi-supervised structured output prediction with operator-valued kernels. Journal of Machine Learning Research, 17:1-48, 2016.
Zoltán Szabó, Bharath K. Sriperumbudur, Barnabás Póczos, Arthur Gretton. Learning Theory for Distribution Regression. Journal of Machine Learning Research, 17(152):1-40, 2016.
Seth Flaxman, Yu-Xiang Wang, and Alex Smola. Who supported Obama in 2012? Ecological inference through distribution regression. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 289-298, 2015.
Fri 9:00 a.m. - 9:30 a.m. | On Structured Prediction Theory with Calibrated Convex Surrogate Losses (Talk)
We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent, and we prove tight bounds on the so-called "calibration function" relating the excess surrogate risk to the actual risk. In contrast to prior related work, we carefully monitor the effect of the exponential number of classes both in the learning guarantees and in the optimization complexity. As an interesting consequence, we formalize the intuition that some task losses make learning harder than others, and that the classical 0-1 loss is ill-suited for general structured prediction. This is joint work with Anton Osokin and Francis Bach (https://arxiv.org/abs/1703.02403).
Simon Lacoste-Julien
Fri 9:30 a.m. - 9:50 a.m. | Differentially Private Database Release via Kernel Mean Embeddings (Contributed Talk)
Authors: Matej Balog, Ilya Tolstikhin, Bernhard Schölkopf.
Fri 9:50 a.m. - 10:10 a.m. | Bayesian Distribution Regression (Contributed Talk)
Authors: Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, Seth Flaxman.
Fri 10:10 a.m. - 11:00 a.m. | Poster Session I & Coffee (posters continue at 2:50 p.m. - 3:50 p.m.)
- Learning from Conditional Distributions via Dual Embeddings. Bo Dai, Niao He, Yunpeng Pan, Byron Boots, Le Song.
- Convolutional Layers based on Directed Multi-Graphs. Tomasz Arodz.
- Squared Earth Mover's Distance Loss for Training Deep Neural Networks on Ordered-Classes. Le Hou, Chen-Ping Yu, Dimitris Samaras.
- Graph based Feature Selection for Structured High Dimensional Data. Thosini K. Bamunu Mudiyanselage, Yanqing Zhang.
- Kernels on Fuzzy Sets: an Overview. Jorge Luis Guevara Diaz.
- Learning from Graphs with Structural Variation. Rune K. Nielsen, Aasa Feragen, Andreas Holm.
- The Geometric Block Model. Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha.
- Worst-case vs. Average-case Design for Estimation from Fixed Pairwise Comparisons. Ashwin Pananjady, Cheng Mao, Vidya Muthukumar, Martin Wainwright, Thomas Courtade.
- Post Selection Inference with Maximum Mean Discrepancy. Denny Wu, Makoto Yamada, Ichiro Takeuchi, Kenji Fukumizu.
- Algorithmic and Statistical Aspects of Linear Regression without Correspondence. Daniel Hsu, Kevin Shi, Xiaorui Sun.
- Large Scale Graph Learning from Smooth Signals. Vassilis Kalofolias, Nathanael Perraudin.
- Differentially Private Database Release via Kernel Mean Embeddings. Matej Balog, Ilya Tolstikhin, Bernhard Schölkopf.
- When is Network Lasso Accurate: The Vector Case. Nguyen Quang Tran, Alexander Jung, Saeed Basirian.
- Bayesian Distribution Regression. Ho Chung Leon Law, Dougal J. Sutherland, Dino Sejdinovic, Seth Flaxman.
- The Weighted Kendall Kernel. Yunlong Jiao, Jean-Philippe Vert.
- On Kernel Methods for Covariates that are Rankings. Horia Mania, Aaditya Ramdas, Martin Wainwright, Michael Jordan, Benjamin Recht.
Fri 11:00 a.m. - 11:20 a.m. | When is Network Lasso Accurate: The Vector Case (Contributed Talk)
Authors: Nguyen Quang Tran, Alexander Jung, Saeed Basirian.
Fri 11:20 a.m. - 11:40 a.m. | Worst-case vs. Average-case Design for Estimation from Fixed Pairwise Comparisons (Contributed Talk)
Authors: Ashwin Pananjady, Cheng Mao, Vidya Muthukumar, Martin Wainwright, Thomas Courtade.
Fri 11:40 a.m. - 12:00 p.m. | The Weighted Kendall Kernel (Contributed Talk)
Authors: Yunlong Jiao, Jean-Philippe Vert.
Fri 12:00 p.m. - 12:20 p.m. | On Kernel Methods for Covariates that are Rankings (Contributed Talk)
Authors: Horia Mania, Aaditya Ramdas, Martin Wainwright, Michael Jordan, Benjamin Recht.
Fri 12:20 p.m. - 1:50 p.m. | Lunch Break
Fri 1:50 p.m. - 2:20 p.m. | Learning on topological and geometrical structures of data (Talk)
Topological data analysis (TDA) is a recent methodology for extracting topological and geometrical features from complex geometric data structures. Persistent homology, a mathematical notion proposed by Edelsbrunner et al. (2002), provides a multiscale descriptor for the topology of data and has recently been applied to a variety of data analysis problems. In this talk I will introduce a machine learning framework for TDA that combines persistent homology and kernel methods. Persistent homology is commonly expressed through persistence diagrams, which record the lifetimes of generators of homology groups as a set of points in the plane. While persistence diagrams serve as a compact representation of data, applying standard data analysis to them is not straightforward. We introduce a kernel embedding of persistence diagrams to obtain a vector representation, which makes any kernel method available for topological data analysis, and propose a persistence weighted Gaussian kernel as a kernel suited to vectorizing persistence diagrams. Some theoretical properties, including Lipschitz continuity of the embedding, are also discussed. I will also present applications to change-point detection and time-series analysis in materials science and biochemistry.
Kenji Fukumizu
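The persistence weighted Gaussian kernel described above admits a compact closed form between two diagrams. Below is a minimal sketch, assuming each diagram is given as an (n, 2) array of (birth, death) pairs and using the arctan persistence weight of Kusano et al. (2016); the function name and defaults are illustrative.

```python
import numpy as np

def pwgk(D1, D2, sigma=1.0, C=1.0, p=1):
    """Persistence weighted Gaussian kernel between two persistence
    diagrams. Computes <E(D1), E(D2)> for the embedding
    E(D) = sum_x w(x) k_sigma(., x), with weight
    w(x) = arctan(C * persistence(x)^p), persistence(x) = death - birth."""
    w1 = np.arctan(C * (D1[:, 1] - D1[:, 0]) ** p)
    w2 = np.arctan(C * (D2[:, 1] - D2[:, 0]) ** p)
    sq = ((D1[:, None, :] - D2[None, :, :]) ** 2).sum(-1)   # pairwise squared dists
    G = np.exp(-sq / (2 * sigma ** 2))                      # Gaussian kernel matrix
    return w1 @ G @ w2

# Two toy diagrams: long-lived features get larger weights.
D = np.array([[0.0, 1.0], [0.2, 0.5]])
E = np.array([[0.1, 0.9]])
val = pwgk(D, E)
```

The weight downplays short-lived (likely noisy) topological features, which is what makes the embedding stable; the full method of the talk additionally composes this with an outer Gaussian kernel on the embedded vectors.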
Fri 2:20 p.m. - 2:50 p.m. | Operator-valued kernels and their application to functional data analysis (Talk)
Positive semidefinite operator-valued kernels generalize the well-known notion of reproducing kernel and are a central concept underlying many kernel-based vector-valued learning algorithms. In this talk I will give a brief introduction to learning with operator-valued kernels, discuss current challenges in the field, and describe convenient schemes to overcome them. I will review our recent work on learning with functional data in the case where both attributes and labels are functions. In this setting, I describe a set of rigorously defined infinite-dimensional operator-valued kernels that can be valuably applied when the data are functions, and introduce a learning scheme for nonlinear functional data analysis. The methodology is illustrated through speech and audio signal processing experiments.
Hachem Kadri
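To make the vector-valued setting concrete, here is a minimal sketch of kernel ridge regression with a separable operator-valued kernel K(x, x') = k(x, x') T, where k is a scalar Gaussian kernel and T is a PSD matrix coupling the outputs. This is a generic textbook construction under stated assumptions, not the speaker's exact functional-data method; all names are illustrative.

```python
import numpy as np

def ovk_ridge_fit(X, Y, T, sigma=1.0, lam=0.1):
    """Vector-valued kernel ridge regression with the separable
    operator-valued kernel K(x, x') = k(x, x') * T.
    Returns coefficients C (n, p) such that f(x) = sum_i k(x, x_i) T C[i]."""
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))              # (n, n) scalar Gram matrix
    n, p = Y.shape
    # Solve (K (x) T + lam I) vec(C) = vec(Y), exploiting Kronecker structure
    A = np.kron(K, T) + lam * np.eye(n * p)
    c = np.linalg.solve(A, Y.reshape(-1))
    return c.reshape(n, p), K

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 1])])
T = np.array([[1.0, 0.3], [0.3, 1.0]])              # couples the two outputs
C, K = ovk_ridge_fit(X, Y, T)
Y_hat = K @ C @ T                                   # fitted training outputs
```

The off-diagonal entries of T let information flow between output components, which is the main practical gain of operator-valued over independent scalar-valued regression.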
Fri 2:50 p.m. - 3:50 p.m. | Poster Session II & Coffee
Fri 3:50 p.m. - 4:20 p.m. | Distribution Regression and its Applications (Talk)
The most common machine learning algorithms operate on finite-dimensional vectorial feature representations. In many applications, however, the natural representation of the data consists of distributions, sets, and other complex objects rather than finite-dimensional vectors. In this talk we will review machine learning algorithms that can operate directly on these complex objects. We will discuss applications in various scientific problems including estimating the cosmological parameters of our Universe, dynamical mass measurements of galaxy clusters, finding anomalous events in fluid dynamics, and estimating phenotypes in agriculturally important plants.
Barnabas Poczos
Fri 4:20 p.m. - 4:50 p.m. | Covariant Compositional Networks for Learning Graphs (Talk)
Most existing neural networks for learning graphs deal with the issue of permutation invariance by conceiving of the network as a message passing scheme, where each node sums the feature vectors coming from its neighbors. We argue that this imposes a limitation on their representation power, and instead propose a new general architecture for representing objects consisting of a hierarchy of parts, which we call covariant compositional networks (CCNs). Here covariance means that the activation of each neuron must transform in a specific way under permutations, similarly to steerability in CNNs. We achieve covariance by making each activation transform according to a tensor representation of the permutation group, and derive the corresponding tensor aggregation rules that each neuron must implement. Experiments show that CCNs can outperform competing methods on some standard graph learning benchmarks.
Risi Kondor
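The covariance property described above can be checked numerically in its simplest, first-order form: under sum-aggregation message passing, relabeling the nodes permutes the activations in exactly the same way, and summing over nodes yields a permutation-invariant readout. A minimal sketch with illustrative names (this is the baseline scheme the talk builds on, not the CCN architecture itself):

```python
import numpy as np

def message_pass(A, H):
    """One round of sum-aggregation message passing: each node's new
    feature is the sum of its neighbours' features plus its own."""
    return (A + np.eye(len(A))) @ H

rng = np.random.default_rng(0)
A = (rng.random((5, 5)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric adjacency, no self-loops
H = rng.normal(size=(5, 3))                   # node features
P = np.eye(5)[rng.permutation(5)]             # random permutation matrix

out = message_pass(A, H)
out_perm = message_pass(P @ A @ P.T, P @ H)   # same graph, relabeled nodes
```

Here out_perm equals P @ out (first-order covariance), so the node-summed readout is invariant; CCNs generalize this to higher-order tensor representations of the permutation group.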
Author Information
Florence d'Alché-Buc (LTCI, Télécom ParisTech, University of Paris-Saclay)
Krikamol Muandet (Mahidol University)
Bharath Sriperumbudur (Penn State University)
Zoltán Szabó (École Polytechnique)
[Homepage](http://www.cmap.polytechnique.fr/~zoltan.szabo/)