A number of recent studies of continuous variational autoencoder (VAE) models have noted, either directly or indirectly, the tendency of various parameter gradients to drift towards infinity during training. Because such gradients can contribute to numerical instabilities, and are often framed as a problematic phenomenon to be avoided, it may be tempting to shift to alternative energy functions that guarantee bounded gradients. But it remains an open question: what might the unintended consequences of such a restriction be? To address this issue, we examine how unbounded gradients relate to the regularization of a broad class of autoencoder-based architectures, including VAE models, as applied to data lying on or near a low-dimensional manifold (e.g., natural images). Our main finding is that, if the ultimate goal is to simultaneously avoid over-regularization (high reconstruction errors, sometimes referred to as posterior collapse) and under-regularization (failure to prune excessive latent dimensions from the model), then an autoencoder-based energy function with infinite gradients around optimal representations is provably required, in a precise technical sense that we carefully detail. Given that both over- and under-regularization can directly lead to poor generated sample quality or suboptimal feature selection, this result suggests that heuristic modifications to, or constraints on, the VAE energy function may at times be ill-advised, and large gradients should be accommodated to the extent possible.
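For intuition only (this sketch is not taken from the paper itself; the notation $\gamma$, $\mu_\theta$, $q_\phi$, and dimension $d$ are assumptions introduced here), one well-known setting where such unbounded gradients appear is a Gaussian-decoder VAE with a shared, learnable decoder variance $\gamma$. Its per-sample negative ELBO can be written as

$$\mathcal{L}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\Big[\tfrac{1}{2\gamma}\,\lVert x - \mu_\theta(z)\rVert_2^2\Big] \;+\; \tfrac{d}{2}\log(2\pi\gamma) \;+\; \mathrm{KL}\big(q_\phi(z \mid x)\,\Vert\,p(z)\big),$$

so that

$$\frac{\partial \mathcal{L}}{\partial \gamma} \;=\; -\tfrac{1}{2\gamma^2}\,\mathbb{E}_{q_\phi(z \mid x)}\!\big[\lVert x - \mu_\theta(z)\rVert_2^2\big] \;+\; \tfrac{d}{2\gamma}.$$

When the data lie on or near a low-dimensional manifold and reconstructions become nearly exact, the optimum pushes $\gamma \to 0$ and this gradient diverges in magnitude. The result summarized above asserts that, in a precise sense detailed in the paper, gradient behavior of this general kind is unavoidable if both over- and under-regularization are to be excluded.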
Author Information
Bin Dai (Samsung Research China - Beijing)
Li Wenliang (University College London)
David Wipf (Microsoft Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: On the Value of Infinite Gradients in Variational Autoencoder Models
More from the Same Authors
- 2021 Poster: A Biased Graph Neural Network Sampler with Near-Optimal Regret
  Qingru Zhang · David Wipf · Quan Gan · Le Song
- 2021 Poster: GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction
  Longyuan Li · Jian Yao · Li Wenliang · Tong He · Tianjun Xiao · Junchi Yan · David Wipf · Zheng Zhang
- 2021 Poster: From Canonical Correlation Analysis to Self-supervised Graph Neural Networks
  Hengrui Zhang · Qitian Wu · Junchi Yan · David Wipf · Philip S Yu
- 2020 Poster: Further Analysis of Outlier Detection with Deep Generative Models
  Ziyu Wang · Bin Dai · David P Wipf · Jun Zhu
- 2017 Oral: From Bayesian Sparsity to Gated Recurrent Nets
  Hao He · Bo Xin · Satoshi Ikehata · David Wipf
- 2017 Poster: From Bayesian Sparsity to Gated Recurrent Nets
  Hao He · Bo Xin · Satoshi Ikehata · David Wipf
- 2016 Poster: A Pseudo-Bayesian Algorithm for Robust PCA
  Tae-Hyun Oh · Yasuyuki Matsushita · In So Kweon · David Wipf
- 2016 Poster: Maximal Sparsity with Deep Networks?
  Bo Xin · Yizhou Wang · Wen Gao · David Wipf · Baoyuan Wang
- 2013 Poster: Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
  Haichao Zhang · David Wipf
- 2013 Oral: Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
  Haichao Zhang · David Wipf