
On the Convergence Theory for Hessian-Free Bilevel Algorithms
Daouda Sow · Kaiyi Ji · Yingbin Liang

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #221

Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations, which can be costly and unscalable in practice. Recently, Hessian-free bilevel schemes have been proposed to resolve this issue, where the general idea is to use zeroth- or first-order methods to approximate the full hypergradient of the bilevel problem. However, we empirically observe that such approximations can lead to large variance and unstable training, whereas estimating only the response Jacobian matrix as a partial component of the hypergradient turns out to be extremely effective. To this end, we propose a new Hessian-free method, which adopts a zeroth-order-like scheme to approximate the response Jacobian matrix by taking the difference between two optimization paths. Theoretically, we provide a convergence rate analysis for the proposed algorithms, where our key challenge is to characterize the approximation and smoothness properties of the trajectory-dependent estimator, which can be of independent interest. This is the first known convergence rate result for this type of Hessian-free bilevel algorithm. Experimentally, we demonstrate that the proposed algorithms outperform baseline bilevel optimizers on various bilevel problems. In particular, in our experiment on few-shot meta-learning with a ResNet-12 network on the miniImageNet dataset, we show that our algorithm outperforms baseline meta-learning algorithms, while other baseline bilevel optimizers do not solve such meta-learning problems within a comparable time frame.
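The core idea in the abstract — estimating how the inner solution responds to the outer variable by differencing two inner optimization paths — can be sketched on a toy problem. This is a minimal illustration, not the paper's algorithm: the quadratic inner objective, step counts, and function names below are all assumptions made for demonstration.

```python
import numpy as np

# Hypothetical toy bilevel setup (names and problem are illustrative only):
#   outer:  min_x f(x, y*(x))
#   inner:  y*(x) = argmin_y g(x, y),  with g(x, y) = 0.5 * ||y - A x||^2
# For this quadratic inner problem the exact response is y*(x) = A x,
# so the true response Jacobian dy*/dx equals A, which lets us sanity-check.

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])

def inner_path(x, y0, steps=200, lr=0.1):
    """Run K steps of gradient descent on the inner objective; return y_K."""
    y = y0.copy()
    for _ in range(steps):
        y -= lr * (y - A @ x)          # grad_y g(x, y) for the toy objective
    return y

def response_jvp(x, v, delta=1e-4):
    """Zeroth-order-like estimate of (dy*/dx) v via two optimization paths:
    one inner trajectory started at x, one at the perturbed point x + delta*v."""
    y0 = np.zeros_like(x)
    y_perturbed = inner_path(x + delta * v, y0)
    y_base = inner_path(x, y0)
    return (y_perturbed - y_base) / delta

x = np.array([1.0, -1.0])
v = np.array([1.0, 0.0])
estimate = response_jvp(x, v)
print(estimate, A @ v)  # the finite-difference estimate vs. the true JVP A v
```

Because the toy inner problem is quadratic, the two converged paths differ almost exactly by the true Jacobian-vector product; in the general nonconvex setting the paper analyzes, characterizing the error of this trajectory-dependent estimator is precisely the technical challenge the abstract mentions.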

Author Information

Daouda Sow (The Ohio State University)
Kaiyi Ji (University at Buffalo)

Kaiyi Ji is now an assistant professor in the Department of Computer Science and Engineering at the University at Buffalo. He was a postdoctoral research fellow in the Electrical Engineering and Computer Science Department at the University of Michigan, Ann Arbor, in 2022, working with Prof. Lei Ying. He received his Ph.D. from the Electrical and Computer Engineering Department of The Ohio State University in December 2021, advised by Prof. Yingbin Liang. He was a visiting student research collaborator in the Department of Electrical Engineering at Princeton University, working with Prof. H. Vincent Poor. He previously obtained his B.S. degree from the University of Science and Technology of China in 2016.

Yingbin Liang (The Ohio State University)
