Pseudo-Spherical Contrastive Divergence
Lantao Yu · Jiaming Song · Yang Song · Stefano Ermon

Wed Dec 08 04:30 PM -- 06:00 PM (PST)
Energy-based models (EBMs) offer a flexible way to parametrize distributions. However, because their partition function is intractable, they are typically trained by maximum likelihood estimation using contrastive divergence. In this paper, we propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of EBMs. PS-CD is derived from maximizing a family of strictly proper homogeneous scoring rules. This derivation avoids computing the intractable partition function and yields a generalized family of learning objectives that includes contrastive divergence as a special case. Moreover, PS-CD lets us choose among these learning objectives when training EBMs without additional computational cost or variational minimax optimization. Theoretical analysis of the proposed method and extensive experiments on both synthetic data and commonly used image datasets demonstrate the effectiveness and modeling flexibility of PS-CD, as well as its robustness to data contamination, showing its advantages over maximum likelihood and $f$-EBMs.
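The role of homogeneity can be illustrated with a small numerical sketch. It assumes the standard $\gamma$-parameterized pseudo-spherical score, $S(x, q) = q(x)^{\gamma} / (\sum_y q(y)^{\gamma+1})^{\gamma/(\gamma+1)}$, on a discrete sample space. The function name, the toy energies, and the choice $\gamma = 1$ are illustrative assumptions and are not taken from the paper's code. The point is only that rescaling $q$ by a constant leaves the score unchanged, so an unnormalized density gives the same value as the normalized one.

```python
import numpy as np

def pseudo_spherical_score(q_unnorm, x, gamma):
    """Pseudo-spherical score of outcome index x under a (possibly
    unnormalized) discrete mass function q_unnorm, for gamma > 0.
    Hypothetical helper for illustration only."""
    q = np.asarray(q_unnorm, dtype=float)
    # Denominator: (sum_y q(y)^(gamma+1))^(gamma / (gamma+1))
    norm = np.sum(q ** (gamma + 1.0)) ** (gamma / (gamma + 1.0))
    return q[x] ** gamma / norm

# Toy energy function on four states (made-up values).
energies = np.array([0.5, 1.0, 2.0, 0.1])
q_tilde = np.exp(-energies)      # unnormalized EBM density exp(-E(x))
q = q_tilde / q_tilde.sum()      # normalized density (needs the partition function)

gamma = 1.0
s_unnorm = pseudo_spherical_score(q_tilde, 2, gamma)
s_norm = pseudo_spherical_score(q, 2, gamma)

# Homogeneity: both scores agree, so the partition function is never needed.
scores_match = bool(np.isclose(s_unnorm, s_norm))
print(scores_match)
```

In a continuous, high-dimensional setting the sum in the denominator becomes an integral that still has to be estimated from samples; this toy example does not attempt that.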

Author Information

Lantao Yu (Stanford University)
Jiaming Song (Stanford University)

I am a first-year Ph.D. student at Stanford University. I think about problems in machine learning and deep learning under the supervision of Stefano Ermon. I did my undergrad at Tsinghua University, where I was lucky enough to collaborate with Jun Zhu and Lawrence Carin on scalable Bayesian machine learning.

Yang Song (Stanford University)
Stefano Ermon (Stanford)
