Poster
Wasserstein Training of Restricted Boltzmann Machines
Grégoire Montavon · Klaus-Robert Müller · Marco Cuturi

Wed Dec 07 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #137

Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. We propose in this work a novel approach for Boltzmann machine training which assumes that a meaningful metric between observations is given. This metric can be represented by the Wasserstein distance between distributions, for which we derive a gradient with respect to the model parameters. Minimization of this new objective leads to generative models with different statistical properties. We demonstrate their practical potential on data completion and denoising, for which the metric between observations plays a crucial role.
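To illustrate why a Wasserstein objective depends on the metric between observations, here is a minimal sketch of a Wasserstein distance between two discrete distributions. The abstract does not say how the distance is computed, so the entropy-regularized (Sinkhorn) approximation used here is an assumption. The names `sinkhorn_distance`, `reg`, and `n_iters` are illustrative and are not taken from the paper.

```python
import math

def sinkhorn_distance(p, q, cost, reg=0.1, n_iters=500):
    """Entropy-regularized approximation of the Wasserstein distance
    between discrete distributions p and q (lists summing to 1),
    given a ground cost matrix cost[i][j] (the metric between observations)."""
    n, m = len(p), len(q)
    # Gibbs kernel derived from the ground metric (assumed regularization strength reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so the plan matches both marginals.
        u = [p[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Total cost of the resulting transport plan u[i] * K[i][j] * v[j].
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Three observations placed at positions 0, 1, 2 on a line; cost is |i - j|.
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
near = sinkhorn_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], cost)
far = sinkhorn_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], cost)
print(near, far)
```

In this toy case the distance grows as the mass moves farther away (about 1 for the neighboring point, about 2 for the distant one), whereas a KL divergence between two non-overlapping point masses is infinite in both cases and does not reflect how far apart the observations are.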

Author Information

Grégoire Montavon (TU Berlin)
Klaus-Robert Müller (TU Berlin)
Marco Cuturi (Apple)

Marco Cuturi is a research scientist at Apple, in Paris. He received his Ph.D. in 11/2005 from the Ecole des Mines de Paris in applied mathematics. Before that he graduated from the National School of Statistics (ENSAE) with a master's degree (MVA) from ENS Cachan. He worked as a post-doctoral researcher at the Institute of Statistical Mathematics, Tokyo, between 11/2005 and 3/2007 and then in the financial industry between 4/2007 and 9/2008. After working at the ORFE department of Princeton University as a lecturer between 2/2009 and 8/2010, he was a tenured associate professor at the Graduate School of Informatics of Kyoto University between 9/2010 and 9/2016. He joined ENSAE in 9/2016 as a professor, where he now works part-time. He was at Google between 10/2018 and 1/2022. Since 1/2022 his main employment has been with Apple, as a research scientist working on fundamental aspects of machine learning.
