Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations
Alexander Immer · Tycho van der Ouderaa · Gunnar Rätsch · Vincent Fortuin · Mark van der Wilk

Wed Nov 30 09:00 AM -- 11:00 AM (PST) @ Hall J #428

Data augmentation is commonly applied to improve the performance of deep learning models by enforcing the knowledge that certain transformations on the input preserve the output. Currently, the data augmentation parameters are chosen by human effort and costly cross-validation, which makes it cumbersome to apply to new datasets. We develop a convenient gradient-based method for selecting the data augmentation without validation data during training of a deep neural network. Our approach relies on phrasing data augmentation as an invariance in the prior distribution on the functions of a neural network, which allows us to learn it using Bayesian model selection. This has been shown to work in Gaussian processes, but not yet for deep neural networks. We propose a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective, which can be optimised without human supervision or validation data. We show that our method can successfully recover invariances present in the data, and that this improves generalisation and data efficiency on image datasets.
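The core idea above, selecting hyperparameters by (approximately) maximising the marginal likelihood rather than a validation score, can be illustrated in miniature. The sketch below is not the paper's Kronecker-factored method for deep networks; it is a hedged, toy analogue: a full Laplace approximation to the evidence of Bayesian logistic regression, used to compare two settings of a hyperparameter (here the prior precision, standing in for augmentation parameters). All function and variable names are illustrative.

```python
import numpy as np

def laplace_log_marglik(X, y, prior_prec):
    """Laplace approximation to the log marginal likelihood of Bayesian
    logistic regression with an isotropic Gaussian prior.

    This is a toy stand-in for the paper's differentiable
    Kronecker-factored Laplace approximation for deep networks:
    log Z ~= log p(D|w*) + log p(w*) + (d/2) log 2*pi - 1/2 log det H,
    where w* is the MAP estimate and H the Hessian of the negative
    log joint at w*.
    """
    n, d = X.shape
    w = np.zeros(d)
    # Find the MAP estimate with Newton's method.
    for _ in range(50):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (p - y) + prior_prec * w
        H = X.T @ (X * (p * (1.0 - p))[:, None]) + prior_prec * np.eye(d)
        w -= np.linalg.solve(H, grad)
    p = 1.0 / (1.0 + np.exp(-X @ w))
    log_lik = np.sum(y * np.log(p) + (1.0 - y) * np.log1p(-p))
    log_prior = 0.5 * d * np.log(prior_prec / (2.0 * np.pi)) \
        - 0.5 * prior_prec * (w @ w)
    _, logdet = np.linalg.slogdet(H)
    return log_lik + log_prior + 0.5 * d * np.log(2.0 * np.pi) - 0.5 * logdet

# Toy usage: data where the feature is genuinely predictive. A moderate
# prior (a "useful" hyperparameter) should earn higher evidence than an
# absurdly strong one that shrinks the model to chance level.
X = np.linspace(-2.0, 2.0, 40).reshape(-1, 1)
y = (X[:, 0] > 0).astype(float)
ev_moderate = laplace_log_marglik(X, y, prior_prec=1.0)
ev_extreme = laplace_log_marglik(X, y, prior_prec=1e8)
```

Because `log Z` is a differentiable function of the hyperparameter, one could also optimise `prior_prec` by gradient ascent instead of comparing discrete candidates; the paper does exactly this, but for augmentation/invariance parameters and with a Kronecker-factored Hessian approximation that scales to deep networks.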

Author Information

Alexander Immer (ETH Zurich, MPI IS)
Tycho van der Ouderaa (Imperial College London)

The main topic of my PhD is learning structure and inductive biases in neural networks. The focus has been on learning symmetry from data, such as equivariance and invariance. The aim of the research is to make learning inductive biases and structure in machine learning models as easy as learning the weights.

Gunnar Rätsch (ETHZ)
Vincent Fortuin (University of Cambridge / Helmholtz AI)
Mark van der Wilk (Imperial College London)

More from the Same Authors