Ensembling is now recognized as an effective approach for increasing the predictive performance and calibration of deep networks. We introduce a new approach, Parameter Ensembling by Perturbation (PEP), which constructs an ensemble of parameter values as random Gaussian perturbations, with a single shared variance parameter, of the optimal parameter set found by training. The variance is chosen to maximize the log-likelihood of the ensemble-average prediction (𝕃) on the validation data set. Empirically, and perhaps surprisingly, 𝕃 has a well-defined maximum as the variance grows from zero (which corresponds to the baseline model). Conveniently, the calibration of the predictions also tends to improve until the peak of 𝕃 is reached. In most experiments, PEP provides a small improvement in performance and, in some cases, a substantial improvement in empirical calibration. We show that this "PEP effect" (the gain in log-likelihood) is related to the mean curvature of the likelihood function and the empirical Fisher information. Experiments on ImageNet pre-trained networks, including ResNet, DenseNet, and Inception, showed improved calibration and likelihood, along with a mild improvement in classification accuracy. Experiments on classification benchmarks such as MNIST and CIFAR-10 showed improved calibration and likelihood and revealed a relationship between the PEP effect and overfitting; this demonstrates that PEP can be used to probe the level of overfitting that occurred during training. In general, no special training procedure or network architecture is needed, and in the case of pre-trained networks, no additional training is needed.
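The procedure described in the abstract can be sketched in a few lines. Below is a minimal PyTorch sketch, assuming a trained classifier `model` and a validation loader `val_loader` yielding (inputs, integer labels); the ensemble size and the sigma grid are illustrative assumptions, not values from the paper.

```python
# Minimal PEP sketch (assumptions: `model` is a trained classifier,
# `val_loader` yields (x, y) with integer class labels).
import torch
import torch.nn.functional as F

@torch.no_grad()
def pep_predict(model, x, sigma, ensemble_size=10):
    """Average softmax outputs over Gaussian perturbations of the trained
    weights (std = sigma), then restore the original parameters."""
    original = [p.detach().clone() for p in model.parameters()]
    probs = None
    for _ in range(ensemble_size):
        for p, p0 in zip(model.parameters(), original):
            p.copy_(p0 + sigma * torch.randn_like(p0))  # theta = theta* + eps
        out = F.softmax(model(x), dim=1)
        probs = out if probs is None else probs + out
    for p, p0 in zip(model.parameters(), original):
        p.copy_(p0)  # restore the trained parameters theta*
    return probs / ensemble_size

@torch.no_grad()
def select_sigma(model, val_loader, sigmas, ensemble_size=10):
    """Pick the sigma that maximizes the mean validation log-likelihood of the
    ensemble-averaged predictions (the quantity the abstract calls L)."""
    model.eval()
    best_sigma, best_ll = 0.0, float("-inf")
    for sigma in sigmas:
        ll, n = 0.0, 0
        for x, y in val_loader:
            probs = pep_predict(model, x, sigma, ensemble_size)
            ll += torch.log(probs.gather(1, y.unsqueeze(1)) + 1e-12).sum().item()
            n += y.size(0)
        if ll / n > best_ll:
            best_sigma, best_ll = sigma, ll / n
    return best_sigma, best_ll

# Example sweep (illustrative grid): sigma = 0 recovers the baseline model;
# L typically rises to a peak and then falls as the perturbations grow.
# sigma_star, ll_star = select_sigma(model, val_loader,
#                                    sigmas=[0.0, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2])
```

Note that only `model.parameters()` is perturbed (not, e.g., batch-norm running statistics), matching the abstract's description of an ensemble over parameter values drawn around the trained optimum.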
Author Information
Alireza Mehrtash (University of British Columbia)
Purang Abolmaesumi (University of British Columbia)
Polina Golland (Massachusetts Institute of Technology)
Tina Kapur (Brigham and Women's Hospital)
Demian Wassermann (Inria)
William Wells (Harvard Medical School)
More from the Same Authors
- 2021 : Bayesian Image Reconstruction using Deep Generative Models
  Razvan Marinescu · Daniel Moyer · Polina Golland
- 2022 : Deployment of deep models for intra-operative margin assessment using mass spectrometry
  Amoon Jamzad · Laura Connolly · Fahimeh Fooladgar · Martin Kaufmann · Kevin Yi Mi Ren · Shaila Merchant · Jay Engel · Sonal Varma · Purang Abolmaesumi · Gabor Fichtinger · John Rudan · Parvin Mousavi
- 2022 : Session 2 Keynote 1
  Purang Abolmaesumi
- 2012 Poster: Identification of Recurrent Patterns in the Activation of Brain Networks
  Firdaus Janoos · Weichang Li · Niranjan Subrahmanya · Istvan Morocz · William Wells
- 2012 Spotlight: Identification of Recurrent Patterns in the Activation of Brain Networks
  Firdaus Janoos · Weichang Li · Niranjan Subrahmanya · Istvan Morocz · William Wells
- 2010 Spotlight: Functional Geometry Alignment and Localization of Brain Areas
  Georg Langs · Yanmei Tie · Laura Rigolo · Alexandra Golby · Polina Golland
- 2010 Poster: Functional Geometry Alignment and Localization of Brain Areas
  Georg Langs · Yanmei Tie · Laura Rigolo · Alexandra Golby · Polina Golland
- 2010 Poster: Categories and Functional Units: An Infinite Hierarchical Model for Brain Activations
  Danial Lashkari · Ramesh Sridharan · Polina Golland
- 2007 Spotlight: Convex Clustering with Exemplar-Based Models
  Danial Lashkari · Polina Golland
- 2007 Poster: Convex Clustering with Exemplar-Based Models
  Danial Lashkari · Polina Golland