NeurIPS Poster Sharpness-Aware Minimization Activates the Interactive Teaching's Understanding and Optimization

Mingwei Xu · Xiaofeng Cao · Ivor Tsang

West Ballroom A-D #7106

Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Teaching is a potentially effective approach for understanding interactions among multiple intelligences. Previous explorations have convincingly shown that teaching presents additional opportunities for observation and demonstration within the learning model, such as data distillation and selection. However, the underlying optimization principles and convergence of interactive teaching lack theoretical analysis, and in this regard co-teaching serves as a notable prototype. In this paper, we discuss its role as a reduction of the larger loss landscape derived from Sharpness-Aware Minimization (SAM). Then, we classify it as an iterative parameter estimation process using Expectation-Maximization. The convergence of this typical interactive teaching is achieved by continuously optimizing a variational lower bound on the log marginal likelihood. This lower bound represents the expected value of the log posterior distribution of the latent variables under a scaled, factorized variational distribution. To further enhance interactive teaching's performance, we incorporate SAM's strong generalization information into interactive teaching, referred to as Sharpness Reduction Interactive Teaching (SRIT). This integration can be viewed as a novel sequential optimization process. Finally, we validate the performance of our approach through multiple experiments.
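
The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the commonly described form of co-teaching, in which each of two learners selects its small-loss samples and hands them to its peer, and the standard SAM update, which takes the gradient at a point perturbed along the normalized gradient direction. The function names, the `keep_ratio` parameter, and the toy quadratic loss are hypothetical choices for illustration; this is not the SRIT algorithm from the paper.

```python
import numpy as np

def small_loss_indices(losses, keep_ratio):
    """Indices of the keep_ratio fraction of samples with the smallest loss."""
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.argsort(losses)[:n_keep]

def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Cross-selection step (assumed form): each learner picks its own
    small-loss samples, and the peer is trained on that selection."""
    train_b_on = small_loss_indices(losses_a, keep_ratio)  # chosen by A, used by B
    train_a_on = small_loss_indices(losses_b, keep_ratio)  # chosen by B, used by A
    return train_a_on, train_b_on

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update: move to w + eps along the normalized gradient,
    then descend from w using the gradient evaluated at w + eps."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # small constant avoids division by zero
    return w - lr * grad_fn(w + eps)

# Toy per-sample losses for two learners on the same four-sample batch.
losses_a = np.array([0.9, 0.1, 0.5, 0.2])
losses_b = np.array([0.3, 0.8, 0.05, 0.6])
train_a_on, train_b_on = co_teaching_exchange(losses_a, losses_b, keep_ratio=0.5)

# Toy loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
w_new = sam_step(np.array([1.0, 0.0]), grad_fn=lambda w: w)
```

In this sketch the two pieces are independent; how the paper actually couples sharpness reduction with the teaching loop is described in the paper itself, not here.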
