Random Reshuffling is Not Always Better

Christopher De Sa

Spotlight presentation: Orals & Spotlights Track 30: Optimization/Theory
on 2020-12-10T07:30:00-08:00 - 2020-12-10T07:40:00-08:00
Poster Session 6
on 2020-12-10T09:00:00-08:00 - 2020-12-10T11:00:00-08:00
GatherTown: Core machine learning & optimization (Town D4 - Spot B1)
Abstract: Many learning algorithms, such as stochastic gradient descent, are affected by the order in which training examples are used. It is often observed that sampling the training examples without replacement, also known as random reshuffling, causes learning algorithms to converge faster. We give a counterexample to the Operator Inequality of Noncommutative Arithmetic and Geometric Means, a longstanding conjecture that relates to the performance of random reshuffling in learning algorithms (Recht and Ré, "Toward a noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences," COLT 2012). We use this to give an example of a learning task and algorithm for which with-replacement random sampling actually outperforms random reshuffling.
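
To make the distinction in the abstract concrete, the sketch below contrasts the two sampling schemes in plain SGD on a toy least-squares problem. This is only an illustration of with-replacement sampling versus random reshuffling, not the paper's counterexample construction; the problem setup, step size, and epoch count are arbitrary assumptions.

```python
# Minimal sketch (assumptions, not the paper's construction): SGD on
# f(w) = (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2 under two sampling schemes.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star  # consistent system, so SGD can drive the error to zero

def sgd(order_fn, epochs=50, lr=0.02):
    """Run SGD, asking order_fn for the sequence of example indices each epoch."""
    w = np.zeros(d)
    for _ in range(epochs):
        for i in order_fn():
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of the i-th example's loss
            w -= lr * grad
    return np.linalg.norm(w - w_star)

# With-replacement sampling: each step draws an example index independently.
with_replacement = lambda: rng.integers(0, n, size=n)
# Random reshuffling: each epoch visits every example exactly once, in a fresh random order.
random_reshuffling = lambda: rng.permutation(n)

print("with replacement  :", sgd(with_replacement))
print("random reshuffling:", sgd(random_reshuffling))
```

On typical convex toy problems like this one, both variants converge; the paper's contribution is a constructed learning task and algorithm for which the with-replacement variant actually converges faster, which is what refuting the conjectured operator inequality makes possible.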
