Energy-Based Models: Structured Learning Beyond Likelihoods
Yann LeCun

Mon Dec 04 03:30 PM -- 05:30 PM (PST) @ Regency F
Event URL: http://www.cs.nyu.edu/~yann/talks/tutorial-nips-2006.html

Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy with each configuration of the variables. Given a set of observed variables, inference in an EBM consists of finding configurations of the unobserved variables that minimize the energy. Training an EBM consists of designing a loss function whose minimization shapes the energy surface so that desired variable configurations have lower energies than undesired ones. EBM approaches have been applied with considerable success to problems such as natural language processing, biological sequence analysis, computer vision (object detection and recognition), image segmentation, image restoration, unsupervised feature learning, and dimensionality reduction.
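As a rough illustration of the two ideas above (inference as energy minimization, training as shaping the energy surface), here is a minimal sketch in Python. It is not taken from the tutorial: the linear energy E(x, y) = -W[y]·x, the toy data, and the names `energy`, `infer`, and `hinge_step` are all assumptions made for this example, and the margin-based hinge loss is just one of several losses that could be used.

```python
# Toy energy-based classifier (illustrative assumption): E(x, y) = -dot(W[y], x).

def energy(W, x, y):
    """Scalar energy of the configuration (x, y); lower means more compatible."""
    return -sum(w * xi for w, xi in zip(W[y], x))

def infer(W, x):
    """Inference: pick the label whose energy is lowest for the observed x."""
    return min(range(len(W)), key=lambda y: energy(W, x, y))

def hinge_step(W, x, y_true, margin=1.0, lr=0.1):
    """One gradient step on max(0, margin + E(x, y_true) - E(x, y_bad))."""
    wrong = [y for y in range(len(W)) if y != y_true]
    # Most offending incorrect label: the wrong answer with the lowest energy.
    y_bad = min(wrong, key=lambda y: energy(W, x, y))
    loss = max(0.0, margin + energy(W, x, y_true) - energy(W, x, y_bad))
    if loss > 0.0:
        for i, xi in enumerate(x):
            W[y_true][i] += lr * xi  # pushes E(x, y_true) down
            W[y_bad][i] -= lr * xi   # pulls E(x, y_bad) up
    return loss

# Hypothetical, linearly separable toy data: (observed x, desired label y).
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.9, 0.2], 0), ([0.1, 0.8], 1)]
W = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(50):
    for x, y in data:
        hinge_step(W, x, y)

predictions = [infer(W, x) for x, _ in data]
```

Note that neither `infer` nor `hinge_step` ever normalizes the energies into probabilities; the loss only compares the energy of the desired answer with that of the lowest-energy wrong answer.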

The first part of the tutorial will introduce the concepts of energy-based inference, discuss the relationships with non-probabilistic forms of graphical models (un-normalized factor graphs), and give the conditions that the loss function must satisfy so that its minimization causes the model to produce good decisions. The second part will discuss the relative merits of EBM approaches and probabilistic approaches. EBMs provide more flexibility than probabilistic approaches in the design of the energy function because of the absence of normalization. More importantly, when training complex probabilistic models, one is often faced with the problem of evaluating (or approximating) intractable sums or integrals. EBMs trained with appropriate loss functions sidestep this problem altogether. The third part will present several popular learning models in the light of the EBM framework. In particular, discriminative learning methods for "structured" outputs will be discussed, including discriminative HMMs, Graph Transformer Networks, Conditional Random Fields, Maximum Margin Markov Networks, and related approaches. A simple interpretation will be given for several approximate maximum likelihood methods, such as products of experts models, variational bound methods, and Hinton's Contrastive Divergence. Lastly, a number of applications to vision, NLP, and bioinformatics will be discussed.
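To make the normalization point concrete, the following is a sketch of the contrast in commonly used EBM notation. The symbols are assumptions introduced here rather than taken from the abstract: W for the parameters, (X^i, Y^i) for a training pair, β for a positive inverse temperature, m for a margin, and Ȳ^i for the lowest-energy incorrect answer.

```latex
% Probabilistic route: turn energies into a normalized (Gibbs) distribution.
P(y \mid X^i) = \frac{e^{-\beta E(W, y, X^i)}}{\sum_{y'} e^{-\beta E(W, y', X^i)}}

% The resulting negative log-likelihood loss contains a sum (or integral)
% over every configuration y, which is the potentially intractable term.
L_{\mathrm{nll}}(W, Y^i, X^i) = E(W, Y^i, X^i)
    + \frac{1}{\beta} \log \sum_{y} e^{-\beta E(W, y, X^i)}

% A margin (hinge) loss only compares two energies and needs no such sum.
L_{\mathrm{hinge}}(W, Y^i, X^i) = \max\bigl(0,\; m + E(W, Y^i, X^i) - E(W, \bar{Y}^i, X^i)\bigr),
\qquad \bar{Y}^i = \arg\min_{y \neq Y^i} E(W, y, X^i)
```

Under these assumptions, the hinge loss requires only a minimization over incorrect answers, whereas the log-likelihood loss requires the full sum over all configurations of y.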

Author Information

Yann LeCun (Facebook)

Yann LeCun is VP & Chief AI Scientist at Meta and Silver Professor at NYU affiliated with the Courant Institute of Mathematical Sciences & the Center for Data Science. He was the founding Director of FAIR (Meta's AI Research group) and of the NYU Center for Data Science. He received an Engineering Diploma from ESIEE (Paris) and a PhD from Sorbonne Université. After a postdoc in Toronto he joined AT&T Bell Labs in 1988, and AT&T Labs in 1996 as Head of Image Processing Research. He joined NYU as a professor in 2003 and Facebook in 2013. His interests include AI, machine learning, computer perception, robotics, and computational neuroscience. He is the recipient of the 2018 ACM Turing Award (with Geoffrey Hinton and Yoshua Bengio) for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing", a member of the National Academy of Sciences and the National Academy of Engineering, and a Chevalier de la Légion d’Honneur.
