Anderson mixing (AM) is a useful method that can accelerate fixed-point iterations by exploiting information from historical iterations. Despite its numerical success in various applications, the memory requirement of AM remains a bottleneck when solving large-scale optimization problems on resource-limited machines. To address this problem, we propose a novel variant of the AM method, called Min-AM, which stores only one vector pair, the minimal memory requirement of AM. Our method forms a symmetric approximation to the inverse Hessian matrix and is proved to be equivalent to full-memory Type-I AM for strongly convex quadratic optimization. Moreover, for general nonlinear optimization problems, we establish the convergence properties of Min-AM under reasonable assumptions and show that the mixing parameters can be adaptively chosen by estimating the eigenvalues of the Hessian. Finally, we extend Min-AM to solve stochastic programming problems. Experimental results on logistic regression and network training problems validate the effectiveness of the proposed Min-AM.
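For intuition, the sketch below illustrates the basic Anderson mixing idea the abstract refers to: solving a fixed-point problem x = g(x) while keeping a single (Δx, Δr) difference pair and using it to correct the plain mixing step. This is a minimal sketch of a Type-I AM update with history size one, not the paper's Min-AM algorithm (which additionally builds a symmetric inverse-Hessian approximation and adapts the mixing parameters); the function name, default β, and safeguard threshold are illustrative choices of ours.

```python
import numpy as np

def anderson_mixing_m1(g, x0, beta=1.0, tol=1e-10, max_iter=200):
    """Memory-one Anderson mixing for the fixed-point problem x = g(x).

    Illustrative sketch only: a Type-I AM update with memory size one,
    not the paper's Min-AM method.
    """
    x = np.asarray(x0, dtype=float)
    r = g(x) - x                       # fixed-point residual
    dx = dr = None                     # the single stored (dx, dr) pair
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        denom = np.dot(dx, dr) if dr is not None else 0.0
        if dr is None or abs(denom) < 1e-14:
            step = beta * r            # plain (Picard) mixing step
        else:
            # Type-I coefficient with memory one: gamma = <dx, r> / <dx, dr>
            gamma = np.dot(dx, r) / denom
            step = beta * r - gamma * (dx + beta * dr)
        x_new = x + step
        r_new = g(x_new) - x_new
        dx, dr = x_new - x, r_new - r  # overwrite the stored pair
        x, r = x_new, r_new
    return x

# Toy usage: solve x = cos(x) (fixed point near 0.739)
print(anderson_mixing_m1(np.cos, np.array([1.0])))
```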
Author Information
Fuchao Wei (Tsinghua University)
Chenglong Bao (Tsinghua University)
Yang Liu (Tsinghua University)
Guangwen Yang (Tsinghua University)
More from the Same Authors
- 2022 Poster: Molecule Generation by Principal Subgraph Mining and Assembling »
  Xiangzhe Kong · Wenbing Huang · Zhixing Tan · Yang Liu
- 2022 Poster: A Closer Look at the Adversarial Robustness of Deep Equilibrium Models »
  Zonghan Yang · Tianyu Pang · Yang Liu
- 2021 Poster: Stochastic Anderson Mixing for Nonconvex Stochastic Optimization »
  Fuchao Wei · Chenglong Bao · Yang Liu
- 2021 Poster: AFEC: Active Forgetting of Negative Transfer in Continual Learning »
  Liyuan Wang · Mingtian Zhang · Zhongfan Jia · Qian Li · Chenglong Bao · Kaisheng Ma · Jun Zhu · Yi Zhong
- 2020 Poster: Task-Oriented Feature Distillation »
  Linfeng Zhang · Yukang Shi · Zuoqiang Shi · Kaisheng Ma · Chenglong Bao
- 2020 Poster: Model-based Adversarial Meta-Reinforcement Learning »
  Zichuan Lin · Garrett Thomas · Guangwen Yang · Tengyu Ma
- 2020 Poster: RD$^2$: Reward Decomposition with Representation Decomposition »
  Zichuan Lin · Derek Yang · Li Zhao · Tao Qin · Guangwen Yang · Tie-Yan Liu
- 2019 Poster: SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models »
  Linfeng Zhang · Zhanhong Tan · Jiebo Song · Jingwei Chen · Chenglong Bao · Kaisheng Ma
- 2019 Poster: Distributional Reward Decomposition for Reinforcement Learning »
  Zichuan Lin · Li Zhao · Derek Yang · Tao Qin · Tie-Yan Liu · Guangwen Yang