Workshop: I Can’t Believe It’s Not Better: Understanding Deep Learning Through Empirical Falsification

An Empirical Analysis of the Advantages of Finite v.s. Infinite Width Bayesian Neural Networks

Jiayu Yao · Yaniv Yacoby · Beau Coker · Weiwei Pan · Finale Doshi-Velez


Understanding the relative advantages of finite versus infinite-width neural networks (NNs) is important for model selection. However, comparing NNs with different widths is challenging because, as the width increases, multiple model properties change simultaneously: model capacity increases while the model becomes less flexible in learning features from the data. Analysis of Bayesian neural networks (BNNs) is even more difficult because inference in the finite-width case is intractable. In this work, we empirically compare finite and infinite-width BNNs and provide quantitative and qualitative explanations for their performance difference. We find that when the limiting model is misspecified, increasing the width can reduce the generalization performance of BNNs. In these cases, we provide evidence that finite BNNs generalize better partly because properties of their frequency spectrum allow them to adapt under model mismatch.
