Timezone: »
Poster
ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization
Xiangyi Chen · Sijia Liu · Kaidi Xu · Xingguo Li · Xue Lin · Mingyi Hong · David Cox
Tue Dec 10 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #16
The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm, that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of $O(\sqrt{d})$ worse than that of the first-order AdaMM algorithm, where $d$ is problem size. In particular, we provide a deep understanding on why Mahalanobis distance matters in convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis makes the first step toward understanding adaptive learning rate methods for nonconvex constrained optimization.Furthermore, we demonstrate two applications, designing per-image and universal adversarial attacks from black-box neural networks, respectively. We perform extensive experiments on ImageNet and empirically show that ZO-AdaMM converges much faster to a solution of high accuracy compared with $6$ state-of-the-art ZO optimization methods.
Author Information
Xiangyi Chen (University of Minnesota)
Sijia Liu (MIT-IBM Watson AI Lab, IBM Research AI)
Sijia Liu received the Ph.D. degree in electrical engineering from Syracuse University in 2016. From 2016 to 2017, he was a postdoctoral research fellow in Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, MI. He is currently a research staff member in IBM Research, MIT-IBM Watson AI Lab. His research interests include machine learning algorithms and statistical signal processing.
Kaidi Xu (Northeastern University)
Xingguo Li (Princeton University)
Xue Lin (Northeastern University)
Mingyi Hong (University of Minnesota)
David Cox (MIT-IBM Watson AI Lab)
More from the Same Authors
-
2020 : Paper 10: Certified Interpretability Robustness for Class Activation Mapping »
Alex Gu · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2021 Spotlight: MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge »
Geng Yuan · Xiaolong Ma · Wei Niu · Zhengang Li · Zhenglun Kong · Ning Liu · Yifan Gong · Zheng Zhan · Chaoyang He · Qing Jin · Siyue Wang · Minghai Qin · Bin Ren · Yanzhi Wang · Sijia Liu · Xue Lin -
2021 : A Unified Framework to Understand Decentralized and Federated Optimization Algorithms: A Multi-Rate Feedback Control Perspective »
xinwei zhang · Mingyi Hong · Nicola Elia -
2022 : A Unified Framework to Understand Decentralized and Federated Optimization Algorithms: A Multi-Rate Feedback Control Perspective »
xinwei zhang · Nicola Elia · Mingyi Hong -
2022 : Building Large Machine Learning Models from Small Distributed Models: A Layer Matching Approach »
xinwei zhang · Bingqing Song · Mehrdad Honarkhah · Jie Ding · Mingyi Hong -
2022 : On the Robustness of deep learning-based MRI Reconstruction to image transformations »
jinghan jia · Mingyi Hong · Yimeng Zhang · Mehmet Akcakaya · Sijia Liu -
2022 Poster: A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization »
Songtao Lu · Siliang Zeng · Xiaodong Cui · Mark Squillante · Lior Horesh · Brian Kingsbury · Jia Liu · Mingyi Hong -
2022 Poster: Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence »
Boyi Liu · Jiayang Li · Zhuoran Yang · Hoi-To Wai · Mingyi Hong · Yu Nie · Zhaoran Wang -
2022 Poster: Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees »
Siliang Zeng · Chenliang Li · Alfredo Garcia · Mingyi Hong -
2022 Poster: Advancing Model Pruning via Bi-level Optimization »
Yihua Zhang · Yuguang Yao · Parikshit Ram · Pu Zhao · Tianlong Chen · Mingyi Hong · Yanzhi Wang · Sijia Liu -
2022 Poster: Distributed Optimization for Overparameterized Problems: Achieving Optimal Dimension Independent Communication Complexity »
Bingqing Song · Ioannis Tsaknakis · Chung-Yiu Yau · Hoi-To Wai · Mingyi Hong -
2021 : Contributed Talk 2: A Unified Framework to Understand Decentralized and Federated Optimization Algorithms: A Multi-Rate Feedback Control Perspective »
xinwei zhang · Mingyi Hong · Nicola Elia -
2021 Poster: Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Robustness Verification »
Shiqi Wang · Huan Zhang · Kaidi Xu · Xue Lin · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2021 Poster: STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning »
Prashant Khanduri · PRANAY SHARMA · Haibo Yang · Mingyi Hong · Jia Liu · Ketan Rajawat · Pramod Varshney -
2021 Poster: A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum »
Prashant Khanduri · Siliang Zeng · Mingyi Hong · Hoi-To Wai · Zhaoran Wang · Zhuoran Yang -
2021 Poster: ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers »
Husheng Han · Kaidi Xu · Xing Hu · Xiaobing Chen · LING LIANG · Zidong Du · Qi Guo · Yanzhi Wang · Yunji Chen -
2021 Poster: When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work »
Jiawei Zhang · Yushun Zhang · Mingyi Hong · Ruoyu Sun · Zhi-Quan Luo -
2021 Poster: MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge »
Geng Yuan · Xiaolong Ma · Wei Niu · Zhengang Li · Zhenglun Kong · Ning Liu · Yifan Gong · Zheng Zhan · Chaoyang He · Qing Jin · Siyue Wang · Minghai Qin · Bin Ren · Yanzhi Wang · Sijia Liu · Xue Lin -
2020 Poster: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality »
Yi Zhang · Orestis Plevrakis · Simon Du · Xingguo Li · Zhao Song · Sanjeev Arora -
2020 Poster: Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond »
Kaidi Xu · Zhouxing Shi · Huan Zhang · Yihan Wang · Kai-Wei Chang · Minlie Huang · Bhavya Kailkhura · Xue Lin · Cho-Jui Hsieh -
2020 Poster: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2020 Poster: Understanding Gradient Clipping in Private SGD: A Geometric Perspective »
Xiangyi Chen · Steven Wu · Mingyi Hong -
2020 Poster: Distributed Training with Heterogeneous Data: Bridging Median- and Mean-Based Algorithms »
Xiangyi Chen · Tiancong Chen · Haoran Sun · Steven Wu · Mingyi Hong -
2020 Spotlight: Understanding Gradient Clipping in Private SGD: A Geometric Perspective »
Xiangyi Chen · Steven Wu · Mingyi Hong -
2020 Spotlight: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2020 Poster: Provably Efficient Neural GTD for Off-Policy Learning »
Hoi-To Wai · Zhuoran Yang · Zhaoran Wang · Mingyi Hong -
2020 Poster: Provable Online CP/PARAFAC Decomposition of a Structured Tensor via Dictionary Learning »
Sirisha Rambhatla · Xingguo Li · Jarvis Haupt -
2020 Poster: Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations »
Joel Dapello · Tiago Marques · Martin Schrimpf · Franziska Geiger · David Cox · James J DiCarlo -
2020 Spotlight: Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations »
Joel Dapello · Tiago Marques · Martin Schrimpf · Franziska Geiger · David Cox · James J DiCarlo -
2020 : Closing Remarks »
David Cox · Alexander Gray -
2020 Expo Workshop: Perspectives on Neurosymbolic Artificial Intelligence Research »
Alexander Gray · David Cox · Luis Lastras -
2020 : Opening Remarks »
David Cox -
2019 : Lunch break and poster »
Felix Sattler · Khaoula El Mekkaoui · Neta Shoham · Cheng Hong · Florian Hartmann · Boyue Li · Daliang Li · Sebastian Caldas Rivera · Jianyu Wang · Kartikeya Bhardwaj · Tribhuvanesh Orekondy · YAN KANG · Dashan Gao · Mingshu Cong · Xin Yao · Songtao Lu · JIAHUAN LUO · Shicong Cen · Peter Kairouz · Yihan Jiang · Tzu Ming Hsu · Aleksei Triastcyn · Yang Liu · Ahmed Khaled Ragab Bayoumi · Zhicong Liang · Boi Faltings · Seungwhan Moon · Suyi Li · Tao Fan · Tianchi Huang · Chunyan Miao · Hang Qi · Matthew Brown · Lucas Glass · Junpu Wang · Wei Chen · Radu Marculescu · tomer avidor · Xueyang Wu · Mingyi Hong · Ce Ju · John Rush · Ruixiao Zhang · Youchi ZHOU · Françoise Beaufays · Yingxuan Zhu · Lei Xia -
2019 Poster: More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation »
Quanfu Fan · Chun-Fu (Richard) Chen · Hilde Kuehne · Marco Pistoia · David Cox -
2019 Poster: Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost »
Zhuoran Yang · Yongxin Chen · Mingyi Hong · Zhaoran Wang -
2019 Poster: Variance Reduced Policy Evaluation with Smooth Function Approximation »
Hoi-To Wai · Mingyi Hong · Zhuoran Yang · Zhaoran Wang · Kexin Tang -
2018 Poster: Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization »
Sijia Liu · Bhavya Kailkhura · Pin-Yu Chen · Paishun Ting · Shiyu Chang · Lisa Amini -
2018 Poster: Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization »
Hoi-To Wai · Zhuoran Yang · Zhaoran Wang · Mingyi Hong