Timezone: »
Bilevel optimization has arisen as a powerful tool for solving a variety of machine learning problems. Two current popular bilevel optimizers AID-BiO and ITD-BiO naturally involve solving one or two sub-problems, and consequently, whether we solve these problems with loops (that take many iterations) or without loops (that take only a few iterations) can significantly affect the overall computational efficiency. Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations. In this paper, we first establish unified convergence analysis for both AID-BiO and ITD-BiO that are applicable to all implementation choices of loops. We then specialize our results to characterize the computational complexity for all implementations, which enable an explicit comparison among them. Our result indicates that for AID-BiO, the loop for estimating the optimal point of the inner function is beneficial for overall efficiency, although it causes higher complexity for each update step, and the loop for approximating the outer-level Hessian-inverse-vector product reduces the gradient complexity. For ITD-BiO, the two loops always coexist, and our convergence upper and lower bounds show that such loops are necessary to guarantee a vanishing convergence error, whereas the no-loop scheme suffers from an unavoidable non-vanishing convergence error. Our numerical experiments further corroborate our theoretical results.
Author Information
Kaiyi Ji (University at Buffalo)
Kaiyi Ji is now an assistant professor at the Department of Computer Science and Engineering of the University at Buffalo. He was a postdoctoral research fellow at the Electrical Engineering and Computer Science Department of the University of Michigan, Ann Arbor, in 2022, working with Prof. Lei Ying. He received his Ph.D. degree from the Electrical and Computer Engineering Department of The Ohio State University in December, 2021, advised by Prof. Yingbin Liang. He was a visiting student research collaborator at the department of Electrical Engineering, Princeton University working with Prof. H. Vincent Poor. Previously he obtained his B.S. degree from University of Science and Technology of China in 2016.
Mingrui Liu (George Mason University)
Yingbin Liang (The Ohio State University)
Lei Ying (University of Michigan, Ann Arbor)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Will Bilevel Optimizers Benefit from Loops »
Tue. Nov 29th 05:00 -- 07:00 PM Room Hall J #808
More from the Same Authors
-
2021 Spotlight: Provably Faster Algorithms for Bilevel Optimization »
Junjie Yang · Kaiyi Ji · Yingbin Liang -
2022 Poster: Provable Generalization of Overparameterized Meta-learning Trained with SGD »
Yu Huang · Yingbin Liang · Longbo Huang -
2022 : Online Min-max Optimization: Nonconvexity, Nonstationarity, and Dynamic Regret »
Yu Huang · Yuan Cheng · Yingbin Liang · Longbo Huang -
2023 Poster: Provably Efficient Algorithm for Nonstationary Low-Rank MDPs »
Yuan Cheng · Jing Yang · Yingbin Liang -
2023 Poster: Global Convergence Analysis of Local SGD for One-hidden-layer Convolutional Neural Network without Overparameterization »
Yajie Bao · Amarda Shehu · Mingrui Liu -
2023 Poster: SimFBO: Towards Simple, Flexible and Communication-efficient Federated Bilevel Learning »
Yifan Yang · Peiyao Xiao · Kaiyi Ji -
2023 Poster: Federated Learning with Client Subsampling, Data Heterogeneity, and Unbounded Smoothness: A New Algorithm and Lower Bounds »
Michael Crawshaw · Yajie Bao · Mingrui Liu -
2023 Poster: Fast and Regret Optimal Best Arm Identification: Fundamental Limits and Low-Complexity Algorithms »
Qining Zhang · Lei Ying -
2023 Poster: Non-Convex Bilevel Optimization with Time-Varying Objective Functions »
Sen Lin · Daouda Sow · Kaiyi Ji · Yingbin Liang · Ness Shroff -
2023 Poster: Achieving Near-optimal Complexity in Hessian-free Stochastic Bilevel Optimization »
Yifan Yang · Peiyao Xiao · Kaiyi Ji -
2023 Poster: Sample Efficient Reinforcement Learning in Mixed Systems through Augmented Samples and Its Applications to Queueing Networks »
Honghao Wei · Xin Liu · Weina Wang · Lei Ying -
2023 Poster: Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms »
Peiyao Xiao · Hao Ban · Kaiyi Ji -
2023 Poster: Bilevel Coreset Selection in Continual Learning: A New Formulation and Algorithm »
Jie Hao · Kaiyi Ji · Mingrui Liu -
2022 Spotlight: A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks »
Mingrui Liu · Zhenxun Zhuang · Yunwen Lei · Chunyang Liao -
2022 Spotlight: Lightning Talks 3B-2 »
Yu Huang · Tero Karras · Maxim Kodryan · Shiau Hong Lim · Shudong Huang · Ziyu Wang · Siqiao Xue · ILYAS MALIK · Ekaterina Lobacheva · Miika Aittala · Hongjie Wu · Yuhao Zhou · Yingbin Liang · Xiaoming Shi · Jun Zhu · Maksim Nakhodnov · Timo Aila · Yazhou Ren · James Zhang · Longbo Huang · Dmitry Vetrov · Ivor Tsang · Hongyuan Mei · Samuli Laine · Zenglin Xu · Wentao Feng · Jiancheng Lv -
2022 Spotlight: Provable Generalization of Overparameterized Meta-learning Trained with SGD »
Yu Huang · Yingbin Liang · Longbo Huang -
2022 Spotlight: Lightning Talks 1A-3 »
Kimia Noorbakhsh · Ronan Perry · Qi Lyu · Jiawei Jiang · Christian Toth · Olivier Jeunen · Xin Liu · Yuan Cheng · Lei Li · Manuel Rodriguez · Julius von Kügelgen · Lars Lorch · Nicolas Donati · Lukas Burkhalter · Xiao Fu · Zhongdao Wang · Songtao Feng · Ciarán Gilligan-Lee · Rishabh Mehrotra · Fangcheng Fu · Jing Yang · Bernhard Schölkopf · Ya-Li Li · Christian Knoll · Maks Ovsjanikov · Andreas Krause · Shengjin Wang · Hong Zhang · Mounia Lalmas · Bolin Ding · Bo Du · Yingbin Liang · Franz Pernkopf · Robert Peharz · Anwar Hithnawi · Julius von Kügelgen · Bo Li · Ce Zhang -
2022 Spotlight: Provable Benefit of Multitask Representation Learning in Reinforcement Learning »
Yuan Cheng · Songtao Feng · Jing Yang · Hong Zhang · Yingbin Liang -
2022 Poster: A Unifying Framework of Off-Policy General Value Function Evaluation »
Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin Liang -
2022 Poster: A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks »
Mingrui Liu · Zhenxun Zhuang · Yunwen Lei · Chunyang Liao -
2022 Poster: Robustness to Unbounded Smoothness of Generalized SignSGD »
Michael Crawshaw · Mingrui Liu · Francesco Orabona · Wei Zhang · Zhenxun Zhuang -
2022 Poster: On the Convergence Theory for Hessian-Free Bilevel Algorithms »
Daouda Sow · Kaiyi Ji · Yingbin Liang -
2022 Poster: Online Convex Optimization with Hard Constraints: Towards the Best of Two Worlds and Beyond »
Hengquan Guo · Xin Liu · Honghao Wei · Lei Ying -
2022 Poster: Provable Benefit of Multitask Representation Learning in Reinforcement Learning »
Yuan Cheng · Songtao Feng · Jing Yang · Hong Zhang · Yingbin Liang -
2021 Poster: Faster Non-asymptotic Convergence for Double Q-learning »
Lin Zhao · Huaqing Xiong · Yingbin Liang -
2021 Poster: Generalization Guarantee of SGD for Pairwise Learning »
Yunwen Lei · Mingrui Liu · Yiming Ying -
2021 Poster: Provably Faster Algorithms for Bilevel Optimization »
Junjie Yang · Kaiyi Ji · Yingbin Liang -
2021 Poster: An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints »
Xin Liu · Bin Li · Pengyi Shi · Lei Ying -
2020 Poster: Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters »
Kaiyi Ji · Jason Lee · Yingbin Liang · H. Vincent Poor -
2020 Poster: Improved Schemes for Episodic Memory-based Lifelong Learning »
Yunhui Guo · Mingrui Liu · Tianbao Yang · Tajana S Rosing -
2020 Spotlight: Improved Schemes for Episodic Memory-based Lifelong Learning »
Yunhui Guo · Mingrui Liu · Tianbao Yang · Tajana S Rosing -
2020 Poster: A Decentralized Parallel Algorithm for Training Generative Adversarial Nets »
Mingrui Liu · Wei Zhang · Youssef Mroueh · Xiaodong Cui · Jarret Ross · Tianbao Yang · Payel Das -
2020 Poster: Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms »
Tengyu Xu · Zhe Wang · Yingbin Liang -
2020 Poster: Finite-Time Analysis for Double Q-learning »
Huaqing Xiong · Lin Zhao · Yingbin Liang · Wei Zhang -
2020 Spotlight: Finite-Time Analysis for Double Q-learning »
Huaqing Xiong · Lin Zhao · Yingbin Liang · Wei Zhang -
2020 Poster: The Mean-Squared Error of Double Q-Learning »
Wentao Weng · Harsh Gupta · Niao He · Lei Ying · R. Srikant -
2019 Poster: Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning »
Harsh Gupta · R. Srikant · Lei Ying -
2019 Poster: SpiderBoost and Momentum: Faster Variance Reduction Algorithms »
Zhe Wang · Kaiyi Ji · Yi Zhou · Yingbin Liang · Vahid Tarokh -
2019 Poster: Finite-Sample Analysis for SARSA with Linear Function Approximation »
Shaofeng Zou · Tengyu Xu · Yingbin Liang -
2019 Poster: Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples »
Tengyu Xu · Shaofeng Zou · Yingbin Liang -
2018 Poster: Convergence of Cubic Regularization for Nonconvex Optimization under KL Property »
Yi Zhou · Zhe Wang · Yingbin Liang -
2018 Spotlight: Convergence of Cubic Regularization for Nonconvex Optimization under KL Property »
Yi Zhou · Zhe Wang · Yingbin Liang -
2018 Poster: Minimax Estimation of Neural Net Distance »
Kaiyi Ji · Yingbin Liang