Poster
Lower Bounds of Uniform Stability in Gradient-Based Bilevel Algorithms for Hyperparameter Optimization
Rongzhen Wang · Chenyu Zheng · Guoqiang Wu · Xu Min · Xiaolu Zhang · Jun Zhou · Chongxuan LI
West Ballroom A-D #5800
Gradient-based bilevel programming leverages unrolling differentiation (UD) or implicit function theorem (IFT) to solve hyperparameter optimization (HO) problems, and is proven effective and scalable in practice. To understand the generalization behavior, existing work establishes upper bounds of uniform stability in these algorithms, while their tightness is still unclear. To this end, this paper attempts to establish lower bounds for UD-based and IFT-based algorithms. The intricate behavior of bilevel programming, i.e. the dependence of each outer-level update on the current turn of inner optimization, poses challenges for deriving stability lower bounds.To address this problem, we introduce lower-bounded expansion properties to characterize the instability in update rules which can serve as general tools for lower-bound analysis. These properties guarantee the hyperparameter divergence at the outer level and the Lipschitz constant of inner output at the inner level in the context of HO.Guided by these insights, we construct a quadratic example that yields tight lower bounds for UD-based algorithms and meaningful bounds for representative IFT-based algorithms.Our tight result indicates that uniform stability has reached its limit in stability analysis for UD-based algorithms.
Live content is unavailable. Log in and register to view live content