We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function. To do so, we leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model. This allows us to approximate the training loss and accuracy at any point during training by solving a low-dimensional Stochastic Differential Equation (SDE) in function space. Using this result, we are able to predict the time it takes for Stochastic Gradient Descent (SGD) to fine-tune a model to a given loss without having to perform any training.
In our experiments, we are able to predict the training time of a ResNet within a 20% error margin on a variety of datasets and hyper-parameters, at a 30- to 45-fold reduction in cost compared to actual training. We also discuss how to further reduce the computational and memory cost of our method, and in particular we show that by exploiting the spectral properties of the gradients' matrix it is possible to predict training time on a large dataset while processing only a subset of the samples.
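The idea of predicting training time from a linearized model can be illustrated with a much simpler setting than the one in the abstract. The sketch below is not the authors' method: it assumes deterministic full-batch gradient descent with a squared loss (rather than the SDE approximation of SGD), and it uses a random matrix `J` as a stand-in for the Jacobian of the network outputs with respect to the weights. The function names `predict_steps` and `train_steps`, and all numeric values, are hypothetical. Under these assumptions, the residual evolves independently along each eigenvector of the Gram matrix `J @ J.T`, so the loss after `t` steps has a closed form and the number of steps needed to reach a target loss can be read off without updating any weights.

```python
import numpy as np

def predict_steps(J, r0, lr, target_loss, max_steps=100_000):
    """Predict the first step t at which the loss of the linearized model
    drops to target_loss, using only the spectrum of the Gram matrix.
    Assumes full-batch gradient descent on L = ||r||^2 / (2 n)."""
    n = J.shape[0]
    lam, U = np.linalg.eigh(J @ J.T)       # eigen-decomposition of the Gram matrix
    c2 = (U.T @ r0) ** 2                   # squared residual coefficients at t = 0
    decay = (1.0 - lr * lam / n) ** 2      # per-step shrink factor of each coefficient
    for t in range(max_steps + 1):
        if c2.sum() / (2 * n) <= target_loss:
            return t
        c2 = c2 * decay
    return None                            # target not reached within max_steps

def train_steps(J, r0, lr, target_loss, max_steps=100_000):
    """Reference: actually run gradient descent on the weights of the
    linear model f(w) = f(w0) + J (w - w0) and count the steps."""
    n, p = J.shape
    dw = np.zeros(p)                       # weight displacement w - w0
    for t in range(max_steps + 1):
        r = r0 + J @ dw                    # current residual f(w) - y
        if r @ r / (2 * n) <= target_loss:
            return t
        dw -= lr * (J.T @ r) / n           # gradient step on the weights
    return None

# Toy problem: 20 samples, 50 weights (all values are illustrative).
rng = np.random.default_rng(0)
J = rng.normal(size=(20, 50))              # stand-in for the network Jacobian
r0 = rng.normal(size=20)                   # stand-in for the initial residual
lr = 0.05

predicted = predict_steps(J, r0, lr, target_loss=1e-3)
actual = train_steps(J, r0, lr, target_loss=1e-3)
print(predicted, actual)
```

In this toy setting the prediction and the actual step count agree because the model is exactly linear; for a real network the linearization only approximates fine-tuning, which is where the error margin quoted above comes from.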
Author Information
Luca Zancato (University of Padova)
Alessandro Achille (Amazon Web Services)
Avinash Ravichandran (AWS)
Rahul Bhotika (Amazon Web Services)
Stefano Soatto (UCLA)
More from the Same Authors
2021 Spotlight: Uniform Sampling over Episode Difficulty »
Sébastien Arnold · Guneet Dhillon · Avinash Ravichandran · Stefano Soatto -
2023 Poster: Your representations are in the network: composable and parallel adaptation for large scale models »
Yonatan Dukler · Alessandro Achille · Hao Yang · Varsha Vivek · Luca Zancato · Benjamin Bowman · Avinash Ravichandran · Charless Fowlkes · Ashwin Swaminathan · Stefano Soatto -
2023 Poster: Leveraging sparse and shared feature activations for disentangled representation learning »
Marco Fumero · Florian Wenzel · Luca Zancato · Alessandro Achille · Emanuele Rodolà · Stefano Soatto · Bernhard Schölkopf · Francesco Locatello -
2022 Poster: Semi-supervised Vision Transformers at Scale »
Zhaowei Cai · Avinash Ravichandran · Paolo Favaro · Manchen Wang · Davide Modolo · Rahul Bhotika · Zhuowen Tu · Stefano Soatto -
2021 Poster: On Plasticity, Invariance, and Mutually Frozen Weights in Sequential Task Learning »
Julian Zilly · Alessandro Achille · Andrea Censi · Emilio Frazzoli -
2021 Poster: Uniform Sampling over Episode Difficulty »
Sébastien Arnold · Guneet Dhillon · Avinash Ravichandran · Stefano Soatto -
2020 Workshop: Deep Learning through Information Geometry »
Pratik Chaudhari · Alexander Alemi · Varun Jog · Dhagash Mehta · Frank Nielsen · Stefano Soatto · Greg Ver Steeg -
2020 Poster: Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction »
Tong He · John Collomosse · Hailin Jin · Stefano Soatto -
2020 Poster: Targeted Adversarial Perturbations for Monocular Depth Prediction »
Alex Wong · Safa Cicek · Stefano Soatto -
2019 : Invited Talk: Stefano Soatto and Alessandro Achille »
Stefano Soatto · Alessandro Achille -
2019 Poster: Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence »
Aditya Sharad Golatkar · Alessandro Achille · Stefano Soatto -
2018 : Poster Session »
Sujay Sanghavi · Vatsal Shah · Yanyao Shen · Tianchen Zhao · Yuandong Tian · Tomer Galanti · Mufan Li · Gilad Cohen · Daniel Rothchild · Aristide Baratin · Devansh Arpit · Vagelis Papalexakis · Michael Perlmutter · Ashok Vardhan Makkuva · Pim de Haan · Yingyan Lin · Wanmo Kang · Cheolhyoung Lee · Hao Shen · Sho Yaida · Dan Roberts · Nadav Cohen · Philippe Casgrain · Dejiao Zhang · Tengyu Ma · Avinash Ravichandran · Julian Emilio Salazar · Bo Li · Davis Liang · Christopher Wong · Glen Bigan Mbeng · Animesh Garg -
2017 : Stefano Soatto »
Stefano Soatto -
2012 Poster: Controlled Recognition Bounds for Visual Learning and Exploration »
Vasiliy Karasev · Alessandro Chiuso · Stefano Soatto -
2011 Poster: Multiple Instance Filtering »
Kamil A Wnuk · Stefano Soatto