Keywords: [ ReScience - MLRC 2021 ] [ Journal Track ]
The claims of the paper are threefold: (1) Cecilia made the surprising yet intriguing discovery that all sources of nondeterminism exhibit a similar degree of variability in the model performance of a neural network throughout the training process. (2) To explain this fact, they have identified model instability during training as the key factor contributing to this phenomenon. (3) They have also proposed two approaches (Accelerated Ensembling and Test-Time Data Augmentation) to mitigate the impact on run-to-run variability without incurring additional training costs. In the paper, the experiments were performed on two types of datasets (image classification and language modelling). However, due to intensive training and time required for each experiment, we will only consider image classification for testing all three claims.