Computing the gradient with respect to model hyperparameters, i.e., the hypergradient, offers a natural and promising way to solve the hyperparameter optimization task. However, gradient-based methods can converge to suboptimal solutions because optimization over a complex hyperparameter space is non-convex. In this study, we propose a hyperparameter mutation (HPM) algorithm that explicitly considers a learnable trade-off between global and local search: a population of student models simultaneously explores the hyperparameter space guided by the hypergradient, while a teacher model mutates the underperforming students by exploiting the top ones. The teacher model is implemented with an attention mechanism and learns a mutation schedule for different hyperparameters on the fly. Empirical evidence on synthetic functions shows that HPM significantly outperforms the hypergradient approach. Experiments on two benchmark datasets further validate the effectiveness of the proposed HPM algorithm for training deep neural networks against several strong baselines.
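The interplay between hypergradient-guided local search and population-level mutation can be illustrated with a small sketch. This is not the authors' implementation: the objective `val_loss`, the function names, and all default values are hypothetical, and the learned attention-based teacher is replaced here by a fixed random perturbation of a top student's hyperparameters (in the style of population based training), purely as a stand-in.

```python
import numpy as np

def val_loss(h):
    """Hypothetical non-convex validation loss; h has shape (..., dim)."""
    return np.sum(h**2 - 3.0 * np.cos(2.0 * np.pi * h) + 3.0, axis=-1)

def hypergrad(h):
    """Analytic gradient of val_loss with respect to the hyperparameters h."""
    return 2.0 * h + 6.0 * np.pi * np.sin(2.0 * np.pi * h)

def hpm_sketch(pop_size=8, dim=2, steps=200, lr=0.002, mutate_every=20, seed=0):
    """Toy population search; assumes pop_size >= 2 so top and bottom sets are disjoint."""
    rng = np.random.default_rng(seed)
    students = rng.uniform(-4.0, 4.0, size=(pop_size, dim))
    k = max(1, pop_size // 4)  # number of students treated as top / underperforming
    history = [float(val_loss(students).min())]
    for t in range(1, steps + 1):
        # Local search: every student takes one hypergradient descent step.
        students = students - lr * hypergrad(students)
        if t % mutate_every == 0:
            # Global search: overwrite the worst students with perturbed copies
            # of top students. The random factor is an ASSUMED stand-in for the
            # learned attention-based teacher described in the abstract.
            order = np.argsort(val_loss(students))
            top, bottom = order[:k], order[-k:]
            for i in bottom:
                parent = students[rng.choice(top)]
                students[i] = parent * rng.uniform(0.8, 1.2, size=dim)
        history.append(float(val_loss(students).min()))
    losses = val_loss(students)
    best = int(np.argmin(losses))
    return students[best].copy(), float(losses[best]), history
```

Calling `hpm_sketch(seed=0)` returns the best hyperparameter vector, its loss, and the best-loss history. Because the top students are never overwritten and the step size is small relative to the curvature of this toy loss, the best loss in the population does not increase over time.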
Author Information
Zhiqiang Tao (Santa Clara University)
Yaliang Li (Alibaba Group)
Bolin Ding (Data Analytics and Intelligence Lab, Alibaba Group)
Ce Zhang (ETH Zurich)
Jingren Zhou (Alibaba Group)
Yun Fu (Northeastern University)