

Poster

On the Complexity of Learning Sparse Functions with Statistical and Gradient Queries

Nirmit Joshi · Theodor Misiakiewicz · Nati Srebro

West Ballroom A-D #5608
[ Paper ]  [ Slides ]  [ Poster ]  [ OpenReview ]
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: The goal of this paper is to investigate the complexity of gradient algorithms when learning sparse functions (juntas). We introduce a class of Statistical Queries (SQ), which we call Differentiable Learning Queries (DLQ), to model gradient queries on a specified loss with respect to an arbitrary model. We provide a tight characterization of the query complexity of DLQ for learning the support of a sparse function over generic product distributions. This complexity crucially depends on the loss function. For the squared loss, DLQ matches the complexity of Correlation Statistical Queries (CSQ), which is potentially much worse than SQ. But for other simple loss functions, including the ℓ1 loss, DLQ always achieves the same complexity as SQ. We also provide evidence that DLQ can indeed capture learning with (stochastic) gradient descent by showing it correctly describes the complexity of learning with a two-layer neural network in the mean field regime and linear scaling.
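As a minimal sketch of the query models compared in the abstract, the oracle forms can be written as follows; the notation (query φ, tolerance τ, model f(x;θ), loss ℓ) is illustrative and the precise tolerance model and normalization are as in the paper.

% Sketch of the query oracles compared above (notation illustrative).
% Each oracle answers with adversarial error up to a tolerance \tau > 0.
\begin{align*}
\text{SQ:}  &\quad \text{query } \phi(x,y), \ \text{receive } v \ \text{with } |v - \mathbb{E}[\phi(x,y)]| \le \tau, \\
\text{CSQ:} &\quad \text{query } q(x), \ \text{receive } v \ \text{with } |v - \mathbb{E}[y\, q(x)]| \le \tau, \\
\text{DLQ:} &\quad \text{query } (\ell, f, \theta), \ \text{receive } v \ \text{with } \big\| v - \mathbb{E}\big[\nabla_\theta\, \ell\big(f(x;\theta),\, y\big)\big] \big\| \le \tau.
\end{align*}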
