Efficient Hyper-parameter Optimization with Cubic Regularization
Zhenqian Shen · Hansi Yang · Yong Li · James Kwok · Quanming Yao

Thu Dec 14 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #1110

As hyper-parameters are ubiquitous and can significantly affect model performance, hyper-parameter optimization is extremely important in machine learning. In this paper, we consider a sub-class of hyper-parameter optimization problems in which hyper-gradients are not available. Such problems frequently appear when the performance metric is non-differentiable or the hyper-parameter is not continuous. However, existing algorithms, such as Bayesian optimization and reinforcement learning, often get trapped in local optima with poor performance. To address this limitation, we propose to use cubic regularization to accelerate convergence and avoid saddle points. First, we adopt stochastic relaxation, which makes it possible to obtain gradient and Hessian information without hyper-gradients. Then, we exploit the rich curvature information through cubic regularization. Theoretically, we prove that the proposed method converges to approximate second-order stationary points, and that convergence is also guaranteed when the lower-level problem is solved inexactly. Experiments on synthetic and real-world data demonstrate the effectiveness of the proposed method.
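
The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single discrete hyper-parameter with three choices, a softmax-parameterized categorical distribution as the stochastic relaxation, and a hypothetical lookup table `f_vals` standing in for the validation loss of each choice. Because the space is tiny, expectations are computed exactly rather than by sampling. The function names, the bisection-based subproblem solver, and the value `rho=10.0` are all illustrative assumptions.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

def relaxed_grad_hess(theta, f_vals):
    """Value, gradient and Hessian of J(theta) = E_{z ~ softmax(theta)}[f(z)].

    Uses score-function identities, so f itself is never differentiated.
    """
    p = softmax(theta)
    J = p @ f_vals
    score = np.eye(len(p)) - p                  # row k: grad of log p(k) = e_k - p
    grad = score.T @ (p * f_vals)               # sum_k p_k f_k score_k
    hess_logp = -(np.diag(p) - np.outer(p, p))  # Hessian of log p(k), same for all k
    hess = (score.T * (p * f_vals)) @ score + J * hess_logp
    return J, grad, hess

def cubic_step(grad, hess, rho, iters=200):
    """Minimize g.d + 0.5 d.H.d + (rho/6)||d||^3 over d.

    The minimizer satisfies d = -(H + lam I)^{-1} g with lam = rho*||d||/2,
    so we bisect on lam. The degenerate "hard case" is not handled here.
    """
    w, V = np.linalg.eigh(hess)
    g_t = V.T @ grad

    def step_norm(lam):
        return np.linalg.norm(g_t / (w + lam))

    lo = max(0.0, -w.min()) + 1e-12
    if step_norm(lo) <= 2.0 * lo / rho:         # g ~ 0 or hard case: return as is
        return -V @ (g_t / (w + lo))
    hi = lo + 1.0
    while step_norm(hi) > 2.0 * hi / rho:       # grow until the root is bracketed
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > 2.0 * mid / rho:
            lo = mid
        else:
            hi = mid
    return -V @ (g_t / (w + hi))

# Toy run: hypothetical validation losses for three discrete choices.
f_vals = np.array([3.0, 1.0, 2.0])
theta = np.zeros(3)                             # start from the uniform distribution
for _ in range(100):
    J, g, H = relaxed_grad_hess(theta, f_vals)
    theta = theta + cubic_step(g, H, rho=10.0)
```

In this toy run the distribution `softmax(theta)` shifts its mass toward the choice with the lowest loss. In a realistic setting the expectations would be estimated from sampled hyper-parameters, and each evaluation of the loss would involve (possibly inexactly) solving the lower-level training problem.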

Author Information

Zhenqian Shen (Tsinghua University)
Hansi Yang (The Hong Kong University of Science and Technology)
Yong Li (Tsinghua University)
James Kwok (The Hong Kong University of Science and Technology)
Quanming Yao (Tsinghua University)
