Timezone: »
Residual Networks (ResNets) have demonstrated significant improvement over traditional Convolutional Neural Networks (CNNs) on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck as one needs to store all the intermediate activations for calculating gradients using backpropagation. In this work, we present the Reversible Residual Network (RevNet), a variant of ResNets where each layer's activations can be reconstructed exactly from the next layer's. Therefore, the activations for most layers need not be stored in memory during backprop. We demonstrate the effectiveness of RevNets on CIFAR and ImageNet, establishing nearly identical performance to equally-sized ResNets, with activation storage requirements independent of depth.
Author Information
Aidan Gomez (University of Toronto)
Mengye Ren (University of Toronto)
Raquel Urtasun (University of Toronto)
Roger Grosse (University of Toronto)
More from the Same Authors
-
2020 : Exploring Representation Learning for Flexible Few-Shot Tasks »
Mengye Ren -
2022 Poster: Amortized Proximal Optimization »
Juhan Bae · Paul Vicol · Jeff Z. HaoChen · Roger Grosse -
2022 Poster: Proximal Learning With Opponent-Learning Awareness »
Stephen Zhao · Chris Lu · Roger Grosse · Jakob Foerster -
2022 Poster: If Influence Functions are the Answer, Then What is the Question? »
Juhan Bae · Nathan Ng · Alston Lo · Marzyeh Ghassemi · Roger Grosse -
2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation »
Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse -
2021 Poster: Differentiable Annealed Importance Sampling and the Perils of Gradient Noise »
Guodong Zhang · Kyle Hsu · Jianing Li · Chelsea Finn · Roger Grosse -
2020 : Invited Talk: Roger Grosse - Why Isn’t Everyone Using Second-Order Optimization? »
Roger Grosse -
2020 Poster: Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians »
Juhan Bae · Roger Grosse -
2020 Poster: Regularized linear autoencoders recover the principal components, eventually »
Xuchan Bao · James Lucas · Sushant Sachdeva · Roger Grosse -
2019 : Poster Session 2 »
Mayur Saxena · Nicholas Frosst · Vivien Cabannes · Gene Kogan · Austin Dill · Anurag Sarkar · Joel Ruben Antony Moniz · Vibert Thio · Scott Sievert · Lia Coleman · Frederik De Bleser · Brian Quanz · Jonathon Kereliuk · Panos Achlioptas · Mohamed Elhoseiny · Songwei Ge · Aidan Gomez · Jamie Brew -
2019 : Carl Doersch, Raquel Urtasun, Sanja Fidler moderated by Natalia Neverova »
Raquel Urtasun · Sanja Fidler · Natalia Neverova · Ilija Radosavovic · Carl Doersch -
2019 : Raquel Urtasun - Science and Engineering for Self-driving »
Raquel Urtasun -
2019 : Coffee Break & Poster Session 1 »
Yan Zhang · Jonathon Hare · Adam Prugel-Bennett · Po Leung · Patrick Flaherty · Pitchaya Wiratchotisatian · Alessandro Epasto · Silvio Lattanzi · Sergei Vassilvitskii · Morteza Zadimoghaddam · Theja Tulabandhula · Fabian Fuchs · Adam Kosiorek · Ingmar Posner · William Hang · Anna Goldie · Sujith Ravi · Azalia Mirhoseini · Yuwen Xiong · Mengye Ren · Renjie Liao · Raquel Urtasun · Haici Zhang · Michele Borassi · Shengda Luo · Andrew Trapp · Geoffroy Dubourg-Felonneau · Yasmeen Kussad · Christopher Bender · Manzil Zaheer · Junier Oliva · Michał Stypułkowski · Maciej Zieba · Austin Dill · Chun-Liang Li · Songwei Ge · Eunsu Kang · Oiwi Parker Jones · Kelvin Ka Wing Wong · Joshua Payne · Yang Li · Azade Nazi · Erkut Erdem · Aykut Erdem · Kevin O'Connor · Juan J Garcia · Maciej Zamorski · Jan Chorowski · Deeksha Sinha · Harry Clifford · John W Cassidy -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · · mariana vargas vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: Fast Convergence of Natural Gradient Descent for Over-Parameterized Neural Networks »
Guodong Zhang · James Martens · Roger Grosse -
2019 Poster: Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model »
Guodong Zhang · Lala Li · Zachary Nado · James Martens · Sushant Sachdeva · George Dahl · Chris Shallue · Roger Grosse -
2019 Poster: Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks »
Qiyang Li · Saminul Haque · Cem Anil · James Lucas · Roger Grosse · Joern-Henrik Jacobsen -
2019 Poster: Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse »
James Lucas · George Tucker · Roger Grosse · Mohammad Norouzi -
2018 : Incremental Few-Shot Learning with Attention Attractor Networks »
Mengye Ren -
2018 : Poster spotlight session. »
Abdullah Salama · Wei-Cheng Chang · Aidan Gomez · Raphael Tang · FUXUN YU · Zhendong Zhang · Yuxin Zhang · Ji Lin · Stephen Tiedemann · Kun Bai · Sivaramakrishnan Sankarapandian · Marton Havasi · Jack Turner · Hsin-Pai Cheng · Yue Wang · Xiaofan Xu · Ruizhou Ding · Haoji Hu · Mohammad Shafiee · Christopher Blake · Chieh-Chi Kao · Daniel Kang · Yew Ken Chia · Amir Ashouri · Sourya Basu · Simon Wiedemann · Thorsten Laude -
2018 Poster: Isolating Sources of Disentanglement in Variational Autoencoders »
Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud -
2018 Oral: Isolating Sources of Disentanglement in Variational Autoencoders »
Tian Qi Chen · Xuechen (Chen) Li · Roger Grosse · David Duvenaud -
2018 Poster: Neural Guided Constraint Logic Programming for Program Synthesis »
Lisa Zhang · Gregory Rosenblatt · Ethan Fetaya · Renjie Liao · William Byrd · Matthew Might · Raquel Urtasun · Richard Zemel -
2018 Poster: Reversible Recurrent Neural Networks »
Matthew MacKay · Paul Vicol · Jimmy Ba · Roger Grosse -
2017 : Machine Learning for Self-Driving Cars, Raquel Urtasun, Uber ATG and University of Toronto »
Raquel Urtasun -
2017 : Raquel Urtasun: Deep Learning for Self-Driving Cars »
Raquel Urtasun -
2017 Poster: Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation »
Yuhuai Wu · Elman Mansimov · Roger Grosse · Shun Liao · Jimmy Ba -
2017 Poster: Attention is All you Need »
Ashish Vaswani · Noam Shazeer · Niki Parmar · Jakob Uszkoreit · Llion Jones · Aidan Gomez · Łukasz Kaiser · Illia Polosukhin -
2017 Spotlight: Attention is All you Need »
Ashish Vaswani · Noam Shazeer · Niki Parmar · Jakob Uszkoreit · Llion Jones · Aidan Gomez · Łukasz Kaiser · Illia Polosukhin -
2017 Spotlight: Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation »
Yuhuai Wu · Elman Mansimov · Roger Grosse · Shun Liao · Jimmy Ba -
2017 Poster: Few-Shot Learning Through an Information Retrieval Lens »
Eleni Triantafillou · Richard Zemel · Raquel Urtasun -
2016 : Raquel Urtasun »
Raquel Urtasun -
2016 : Invited Talk - TorontoCity Benchmark: Towards Building Large Scale 3D Models of the World »
Raquel Urtasun -
2016 : Invited Talk: Towards Affordable Self-driving Cars (Raquel Urtasun, University of Toronto) »
Raquel Urtasun -
2016 Symposium: Deep Learning Symposium »
Yoshua Bengio · Yann LeCun · Navdeep Jaitly · Roger Grosse -
2016 Poster: Proximal Deep Structured Models »
Shenlong Wang · Sanja Fidler · Raquel Urtasun -
2016 Poster: Measuring the reliability of MCMC inference with bidirectional Monte Carlo »
Roger Grosse · Siddharth Ancha · Daniel Roy -
2016 Poster: Understanding the Effective Receptive Field in Deep Convolutional Neural Networks »
Wenjie Luo · Yujia Li · Raquel Urtasun · Richard Zemel -
2016 Poster: Learning Deep Parsimonious Representations »
Renjie Liao · Alex Schwing · Richard Zemel · Raquel Urtasun -
2015 Poster: Skip-Thought Vectors »
Jamie Kiros · Yukun Zhu · Russ Salakhutdinov · Richard Zemel · Raquel Urtasun · Antonio Torralba · Sanja Fidler -
2015 Poster: Learning Wake-Sleep Recurrent Attention Models »
Jimmy Ba · Russ Salakhutdinov · Roger Grosse · Brendan J Frey -
2015 Spotlight: Learning Wake-Sleep Recurrent Attention Models »
Jimmy Ba · Russ Salakhutdinov · Roger Grosse · Brendan J Frey -
2015 Poster: 3D Object Proposals for Accurate Object Class Detection »
Xiaozhi Chen · Kaustav Kundu · Yukun Zhu · Andrew G Berneshawi · Huimin Ma · Sanja Fidler · Raquel Urtasun -
2015 Poster: Exploring Models and Data for Image Question Answering »
Mengye Ren · Jamie Kiros · Richard Zemel -
2014 Poster: Efficient Inference of Continuous Markov Random Fields with Polynomial Potentials »
Shenlong Wang · Alex Schwing · Raquel Urtasun -
2014 Poster: Message Passing Inference for Large Scale Graphical Models with High Order Potentials »
Jian Zhang · Alex Schwing · Raquel Urtasun -
2013 Poster: Annealing between distributions by averaging moments »
Roger Grosse · Chris Maddison · Russ Salakhutdinov -
2013 Oral: Annealing between distributions by averaging moments »
Roger Grosse · Chris Maddison · Russ Salakhutdinov -
2013 Poster: Latent Structured Active Learning »
Wenjie Luo · Alex Schwing · Raquel Urtasun -
2012 Poster: 3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model »
Sanja Fidler · Sven Dickinson · Raquel Urtasun -
2012 Poster: Globally Convergent Dual MAP LP Relaxation Solvers using Fenchel-Young Margins »
Alex Schwing · Tamir Hazan · Marc Pollefeys · Raquel Urtasun -
2012 Spotlight: 3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model »
Sanja Fidler · Sven Dickinson · Raquel Urtasun -
2012 Session: Oral Session 1 »
Raquel Urtasun -
2011 Session: Spotlight Session 5 »
Raquel Urtasun -
2011 Session: Oral Session 6 »
Raquel Urtasun -
2011 Poster: Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities »
Angela Yao · Juergen Gall · Luc V Gool · Raquel Urtasun -
2011 Poster: Joint 3D Estimation of Objects and Scene Layout »
Andreas Geiger · Christian Wojek · Raquel Urtasun -
2010 Poster: Sparse Coding for Learning Interpretable Spatio-Temporal Primitives »
Taehwan Kim · Greg Shakhnarovich · Raquel Urtasun -
2010 Session: Spotlights Session 6 »
Raquel Urtasun -
2010 Session: Oral Session 7 »
Raquel Urtasun -
2010 Poster: Implicitly Constrained Gaussian Process Regression for Monocular Non-Rigid Pose Estimation »
Mathieu Salzmann · Raquel Urtasun -
2010 Poster: A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction »
Tamir Hazan · Raquel Urtasun