Poster
Closing the gap between the upper bound and lower bound of Adam's iteration complexity
Bohan Wang · Jingwen Fu · Huishuai Zhang · Nanning Zheng · Wei Chen
Recently, Arjevani et al. [1] established a lower bound on the iteration complexity of first-order optimization under an $L$-smooth condition and a bounded noise variance assumption. However, a thorough review of the existing literature on Adam's convergence reveals a noticeable gap: none of the existing results meets this lower bound. In this paper, we close the gap by deriving a new convergence guarantee for Adam under only an $L$-smooth condition and a bounded noise variance assumption. Our results remain valid across a broad spectrum of hyperparameters. In particular, with properly chosen hyperparameters, we derive an upper bound on the iteration complexity of Adam and show that it meets the lower bound for first-order optimizers. To the best of our knowledge, this is the first work to establish such a tight upper bound for Adam's convergence. Our proof uses novel techniques to handle the entanglement between momentum and the adaptive learning rate and to convert the first-order term in the Descent Lemma to the gradient norm, which may be of independent interest.
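For readers unfamiliar with the update rule under analysis, the following is a minimal sketch of one Adam step on a scalar parameter. It assumes the standard form with bias correction; the exact variant and hyperparameter schedule analyzed in the paper are not stated on this page, so the function names, default values, and the toy objective $f(x) = \frac{1}{2}x^2$ are illustrative assumptions only. The sketch shows where momentum (`m`) and the adaptive learning rate (`v`) interact in a single update.

```python
import math
import random


def adam_step(x, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update on a scalar (assumed form, with bias correction)."""
    m = beta1 * m + (1.0 - beta1) * grad         # first-moment (momentum) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)               # bias-corrected momentum
    v_hat = v / (1.0 - beta2 ** t)               # bias-corrected second moment
    # Momentum is divided by the adaptive scale: both depend on past gradients.
    x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v


def run_adam(steps=2000, x0=5.0, noise_std=0.1, seed=0):
    """Run Adam on the toy objective f(x) = 0.5 * x**2 with noisy gradients."""
    rng = random.Random(seed)
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        # Stochastic gradient: true gradient x plus bounded-variance noise.
        grad = x + rng.gauss(0.0, noise_std)
        x, m, v = adam_step(x, grad, m, v, t)
    return x


if __name__ == "__main__":
    print(run_adam())
```

The toy objective is 1-smooth and the Gaussian noise has bounded variance, so it loosely mirrors the two assumptions named in the abstract, but it is not an experiment from the paper.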
Author Information
Bohan Wang (USTC)
Jingwen Fu (Xi'an Jiaotong University)
Huishuai Zhang (Microsoft Research Asia)
Nanning Zheng (Xi'an Jiaotong University)
Wei Chen (Chinese Academy of Sciences)
More from the Same Authors
-
2022 Poster: Could Giant Pre-trained Image Models Extract Universal Representations? »
Yutong Lin · Ze Liu · Zheng Zhang · Han Hu · Nanning Zheng · Stephen Lin · Yue Cao -
2023 : Training Private and Efficient Language Models with Synthetic Data from LLMs »
Da Yu · Arturs Backurs · Sivakanth Gopi · Huseyin A. Inan · Janardhan Kulkarni · Zinan Lin · Chulin Xie · Huishuai Zhang · Wanrong Zhang -
2023 : Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study »
Prin Phunyaphibarn · Junghyun Lee · Bohan Wang · Huishuai Zhang · Chulhee Yun -
2023 Poster: FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning »
Kun Song · Huimin Ma · Bochao Zou · Huishuai Zhang · Weiran Huang -
2023 Poster: Learning Trajectories are Generalization Indicators »
Jingwen Fu · Zhizheng Zhang · Dacheng Yin · Yan Lu · Nanning Zheng -
2023 Poster: DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation »
Kaipeng Zheng · Huishuai Zhang · Weiran Huang -
2023 Poster: On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training »
Jieyu Zhang · Bohan Wang · Zhengyu Hu · Pang Wei Koh · Alexander Ratner -
2023 Poster: Geometric Transformer with Interatomic Positional Encoding »
Yusong Wang · Shaoning Li · Tong Wang · Bin Shao · Nanning Zheng · Tie-Yan Liu -
2023 Poster: DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models »
Tao Yang · Yuwang Wang · Yan Lu · Nanning Zheng -
2023 Poster: Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions »
Xiang Cheng · Bohan Wang · Jingzhao Zhang · Yusong Zhu -
2023 Poster: On the Generalization Properties of Diffusion Models »
Puheng Li · Zhong Li · Huishuai Zhang · Jiang Bian -
2022 Spotlight: Lightning Talks 5B-4 »
Yuezhi Yang · Zeyu Yang · Yong Lin · Yishi Xu · Linan Yue · Tao Yang · Weixin Chen · Qi Liu · Jiaqi Chen · Dongsheng Wang · Baoyuan Wu · Yuwang Wang · Hao Pan · Shengyu Zhu · Zhenwei Miao · Yan Lu · Lu Tan · Bo Chen · Yichao Du · Haoqian Wang · Wei Li · Yanqing An · Ruiying Lu · Peng Cui · Nanning Zheng · Li Wang · Zhibin Duan · Xiatian Zhu · Mingyuan Zhou · Enhong Chen · Li Zhang -
2022 Spotlight: Visual Concepts Tokenization »
Tao Yang · Yuwang Wang · Yan Lu · Nanning Zheng -
2022 Spotlight: Lightning Talks 4A-3 »
Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao -
2022 Spotlight: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 2A-3 »
David Buterez · Chengan He · Xuan Kan · Yutong Lin · Konstantin Schürholt · Yu Yang · Louis Annabi · Wei Dai · Xiaotian Cheng · Alexandre Pitti · Ze Liu · Jon Paul Janet · Jun Saito · Boris Knyazev · Mathias Quoy · Zheng Zhang · James Zachary · Steven J Kiddle · Xavier Giro-i-Nieto · Chang Liu · Hejie Cui · Zilong Zhang · Hakan Bilen · Damian Borth · Dino Oglic · Holly Rushmeier · Han Hu · Xiangyang Ji · Yi Zhou · Nanning Zheng · Ying Guo · Pietro Liò · Stephen Lin · Carl Yang · Yue Cao -
2022 Spotlight: Could Giant Pre-trained Image Models Extract Universal Representations? »
Yutong Lin · Ze Liu · Zheng Zhang · Han Hu · Nanning Zheng · Stephen Lin · Yue Cao -
2022 Poster: Does Momentum Change the Implicit Regularization on Separable Data? »
Bohan Wang · Qi Meng · Huishuai Zhang · Ruoyu Sun · Wei Chen · Zhi-Ming Ma · Tie-Yan Liu -
2022 Poster: Visual Concepts Tokenization »
Tao Yang · Yuwang Wang · Yan Lu · Nanning Zheng -
2021 Poster: Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD »
Bohan Wang · Huishuai Zhang · Jieyu Zhang · Qi Meng · Wei Chen · Tie-Yan Liu -
2021 Poster: Co-evolution Transformer for Protein Contact Prediction »
He Zhang · Fusong Ju · Jianwei Zhu · Liang He · Bin Shao · Nanning Zheng · Tie-Yan Liu -
2021 Poster: Dynamic Grained Encoder for Vision Transformers »
Lin Song · Songyang Zhang · Songtao Liu · Zeming Li · Xuming He · Hongbin Sun · Jian Sun · Nanning Zheng -
2021 Poster: Instance-Conditional Knowledge Distillation for Object Detection »
Zijian Kang · Peizhen Zhang · Xiangyu Zhang · Jian Sun · Nanning Zheng -
2020 Poster: Compositional Generalization by Learning Analytical Expressions »
Qian Liu · Shengnan An · Jian-Guang Lou · Bei Chen · Zeqi Lin · Yan Gao · Bin Zhou · Nanning Zheng · Dongmei Zhang -
2020 Spotlight: Compositional Generalization by Learning Analytical Expressions »
Qian Liu · Shengnan An · Jian-Guang Lou · Bei Chen · Zeqi Lin · Yan Gao · Bin Zhou · Nanning Zheng · Dongmei Zhang -
2020 Poster: Rethinking Learnable Tree Filter for Generic Feature Transform »
Lin Song · Yanwei Li · Zhengkai Jiang · Zeming Li · Xiangyu Zhang · Hongbin Sun · Jian Sun · Nanning Zheng -
2020 Poster: Fine-Grained Dynamic Head for Object Detection »
Lin Song · Yanwei Li · Zhengkai Jiang · Zeming Li · Hongbin Sun · Jian Sun · Nanning Zheng -
2019 : Break / Poster Session 1 »
Antonia Marcu · Yao-Yuan Yang · Pascale Gourdeau · Chen Zhu · Thodoris Lykouris · Jianfeng Chi · Mark Kozdoba · Arjun Nitin Bhagoji · Xiaoxia Wu · Jay Nandy · Michael T Smith · Bingyang Wen · Yuege Xie · Konstantinos Pitas · Suprosanna Shit · Maksym Andriushchenko · Dingli Yu · Gaël Letarte · Misha Khodak · Hussein Mozannar · Chara Podimata · James Foulds · Yizhen Wang · Huishuai Zhang · Ondrej Kuzelka · Alexander Levine · Nan Lu · Zakaria Mhammedi · Paul Viallard · Diana Cai · Lovedeep Gondara · James Lucas · Yasaman Mahdaviyeh · Aristide Baratin · Rishi Bommasani · Alessandro Barp · Andrew Ilyas · Kaiwen Wu · Jens Behrmann · Omar Rivasplata · Amir Nazemi · Aditi Raghunathan · Will Stephenson · Sahil Singla · Akhil Gupta · YooJung Choi · Yannic Kilcher · Clare Lyle · Edoardo Manino · Andrew Bennett · Zhi Xu · Niladri Chatterji · Emre Barut · Flavien Prost · Rodrigo Toro Icarte · Arno Blaas · Chulhee Yun · Sahin Lale · YiDing Jiang · Tharun Kumar Reddy Medini · Ashkan Rezaei · Alexander Meinke · Stephen Mell · Gary Kazantsev · Shivam Garg · Aradhana Sinha · Vishnu Lokhande · Geovani Rizk · Han Zhao · Aditya Kumar Akash · Jikai Hou · Ali Ghodsi · Matthias Hein · Tyler Sypherd · Yichen Yang · Anastasia Pentina · Pierre Gillot · Antoine Ledent · Guy Gur-Ari · Noah MacAulay · Tianzong Zhang -
2019 Poster: Learnable Tree Filter for Structure-preserving Feature Transform »
Lin Song · Yanwei Li · Zeming Li · Gang Yu · Hongbin Sun · Jian Sun · Nanning Zheng -
2018 Poster: On the Local Hessian in Back-propagation »
Huishuai Zhang · Wei Chen · Tie-Yan Liu