Timezone: »

 
Poster
ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization
Xiangyi Chen · Sijia Liu · Kaidi Xu · Xingguo Li · Xue Lin · Mingyi Hong · David Cox

Tue Dec 10 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #16
The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm, that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of $O(\sqrt{d})$ worse than that of the first-order AdaMM algorithm, where $d$ is problem size. In particular, we provide a deep understanding on why Mahalanobis distance matters in convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis makes the first step toward understanding adaptive learning rate methods for nonconvex constrained optimization.Furthermore, we demonstrate two applications, designing per-image and universal adversarial attacks from black-box neural networks, respectively. We perform extensive experiments on ImageNet and empirically show that ZO-AdaMM converges much faster to a solution of high accuracy compared with $6$ state-of-the-art ZO optimization methods.

Author Information

Xiangyi Chen (University of Minnesota)
Sijia Liu (MIT-IBM Watson AI Lab, IBM Research AI)

Sijia Liu received the Ph.D. degree in electrical engineering from Syracuse University in 2016. From 2016 to 2017, he was a postdoctoral research fellow in Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, MI. He is currently a research staff member in IBM Research, MIT-IBM Watson AI Lab. His research interests include machine learning algorithms and statistical signal processing.

Kaidi Xu (Northeastern University)
Xingguo Li (Princeton University)
Xue Lin (Northeastern University)
Mingyi Hong (University of Minnesota)
David Cox (MIT-IBM Watson AI Lab)

More from the Same Authors