Score matching is an alternative to maximum likelihood (ML) for estimating a probability distribution parametrized up to a constant of proportionality. By fitting the "score" of the distribution, it sidesteps the need to compute this constant of proportionality, which is often intractable. While score matching and its variants are popular in practice, a precise theoretical understanding of their computational and statistical benefits and tradeoffs relative to maximum likelihood is lacking. In this work, we give the first example of a natural exponential family of distributions for which the score matching loss is computationally efficient to optimize and has statistical efficiency comparable to ML, while the ML loss is intractable to optimize using gradient-based methods. The family consists of exponentials of polynomials of fixed degree, and our result can be viewed as a continuous analogue of recent developments in the discrete setting. Precisely, we show: (1) designing a zeroth-order or first-order oracle for optimizing the ML loss is NP-hard; (2) ML has statistical efficiency polynomial in the ambient dimension and the radius of the parameters of the family; (3) minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
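As brief background on how score matching avoids the normalizing constant (the standard Hyvärinen score matching identity, not a result specific to this paper): for a density p_\theta(x) \propto \exp(P_\theta(x)), the score \nabla_x \log p_\theta(x) = \nabla_x P_\theta(x) is free of the constant of proportionality, and integration by parts turns the score matching loss into an expectation over the data distribution p alone:

  J(\theta) = \mathbb{E}_{x \sim p}\left[ \tfrac{1}{2} \lVert \nabla_x P_\theta(x) \rVert^2 + \Delta_x P_\theta(x) \right],

where \Delta_x denotes the Laplacian. For the family considered here, P_\theta(x) = \langle \theta, T(x) \rangle with T(x) the vector of monomials of degree at most d, so J(\theta) is a convex quadratic in \theta; this gives some intuition for why the score matching loss is efficiently optimizable, with the precise computational and statistical guarantees being the subject of the paper.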
Author Information
Chirag Pabbaraju (Stanford University)
Dhruv Rohatgi (Massachusetts Institute of Technology)
Anish Prasad Sevekari (Carnegie Mellon University)
Holden Lee (Johns Hopkins University)
Ankur Moitra (MIT)
Andrej Risteski (CMU)
Assistant Professor in the ML department at CMU. Prior to that I was a Wiener Fellow at MIT, and before that I finished my PhD at Princeton University.
More from the Same Authors
- 2021 Spotlight: Parametric Complexity Bounds for Approximating PDEs with Neural Networks »
  Tanya Marwah · Zachary Lipton · Andrej Risteski
- 2021 : Robust Algorithms for GMM Estimation: A Finite Sample Viewpoint »
  Dhruv Rohatgi
- 2022 : Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization »
  Elan Rosenfeld · Pradeep Ravikumar · Andrej Risteski
- 2022 : Statistical Efficiency of Score Matching: The View from Isoperimetry »
  Frederic Koehler · Alexander Heckett · Andrej Risteski
- 2023 : Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization »
  Elan Rosenfeld · Andrej Risteski
- 2023 : Fit Like You Sample: Sample-Efficient Score Matching From Fast Mixing Diffusions »
  Yilong Qin · Andrej Risteski
- 2023 : Invited Talk 1 »
  Andrej Risteski
- 2023 Poster: Progressive Ensemble Distillation: Building Ensembles for Efficient Inference »
  Don Dennis · Abhishek Shetty · Anish Prasad Sevekari · Kazuhito Koishida · Virginia Smith
- 2023 Poster: Deep Equilibrium Based Neural Operators for Steady-State PDEs »
  Tanya Marwah · Ashwini Pokle · J. Zico Kolter · Zachary Lipton · Jianfeng Lu · Andrej Risteski
- 2023 Poster: Connecting Pre-trained Language Model and Downstream Task via Properties of Representation »
  Chenwei Wu · Holden Lee · Rong Ge
- 2023 Poster: Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars »
  Kaiyue Wen · Yuchen Li · Bingbin Liu · Andrej Risteski
- 2023 Poster: Harnessing the power of choices in decision tree learning »
  Guy Blanc · Jane Lange · Chirag Pabbaraju · Colin Sullivan · Li-Yang Tan · Mo Tiwari
- 2023 Poster: Provable benefits of annealing for estimating normalizing constants: Importance Sampling, Noise-Contrastive Estimation, and beyond »
  Omar Chehab · Aapo Hyvarinen · Andrej Risteski
- 2023 Poster: Feature Adaptation for Sparse Linear Regression »
  Jonathan Kelner · Frederic Koehler · Raghu Meka · Dhruv Rohatgi
- 2023 Poster: The probability flow ODE is provably fast »
  Sitan Chen · Sinho Chewi · Holden Lee · Yuanzhi Li · Jianfeng Lu · Adil Salim
- 2022 Poster: Robust Generalized Method of Moments: A Finite Sample Viewpoint »
  Dhruv Rohatgi · Vasilis Syrgkanis
- 2022 Poster: Polynomial time guarantees for the Burer-Monteiro method »
  Diego Cifuentes · Ankur Moitra
- 2022 Poster: Lower Bounds on Randomly Preconditioned Lasso via Robust Sparse Designs »
  Jonathan Kelner · Frederic Koehler · Raghu Meka · Dhruv Rohatgi
- 2022 Poster: Learning in Observable POMDPs, without Computationally Intractable Oracles »
  Noah Golowich · Ankur Moitra · Dhruv Rohatgi
- 2022 Poster: Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments »
  Yining Chen · Elan Rosenfeld · Mark Sellke · Tengyu Ma · Andrej Risteski
- 2022 Poster: Masked Prediction: A Parameter Identifiability View »
  Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski
- 2022 Poster: Robust Model Selection and Nearly-Proper Learning for GMMs »
  Allen Liu · Jerry Li · Ankur Moitra
- 2022 Poster: Continual learning: a feature extraction formalization, an efficient algorithm, and fundamental obstructions »
  Binghui Peng · Andrej Risteski
- 2021 : Zoom Q&A for Contributed talks Session 3 »
  Dhruv Rohatgi
- 2021 : Contributed talks Session 3 »
  Dhruv Rohatgi
- 2021 Poster: Parametric Complexity Bounds for Approximating PDEs with Neural Networks »
  Tanya Marwah · Zachary Lipton · Andrej Risteski
- 2021 Poster: Universal Approximation Using Well-Conditioned Normalizing Flows »
  Holden Lee · Chirag Pabbaraju · Anish Prasad Sevekari · Andrej Risteski
- 2021 Poster: A No-go Theorem for Robust Acceleration in the Hyperbolic Plane »
  Linus Hamilton · Ankur Moitra
- 2020 Poster: Learning Some Popular Gaussian Graphical Models without Condition Number Bounds »
  Jonathan Kelner · Frederic Koehler · Raghu Meka · Ankur Moitra
- 2020 Spotlight: Learning Some Popular Gaussian Graphical Models without Condition Number Bounds »
  Jonathan Kelner · Frederic Koehler · Raghu Meka · Ankur Moitra
- 2020 Poster: Truncated Linear Regression in High Dimensions »
  Constantinos Daskalakis · Dhruv Rohatgi · Emmanouil Zampetakis
- 2020 Poster: Tensor Completion Made Practical »
  Allen Liu · Ankur Moitra
- 2020 Poster: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
  Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter
- 2020 Spotlight: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
  Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter
- 2020 Poster: Constant-Expansion Suffices for Compressed Sensing with Generative Priors »
  Constantinos Daskalakis · Dhruv Rohatgi · Emmanouil Zampetakis
- 2020 Poster: Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability »
  Sitan Chen · Frederic Koehler · Ankur Moitra · Morris Yau
- 2020 Poster: Learning Structured Distributions From Untrusted Batches: Faster and Simpler »
  Sitan Chen · Jerry Li · Ankur Moitra
- 2020 Spotlight: Constant-Expansion Suffices for Compressed Sensing with Generative Priors »
  Constantinos Daskalakis · Dhruv Rohatgi · Emmanouil Zampetakis
- 2020 Spotlight: Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability »
  Sitan Chen · Frederic Koehler · Ankur Moitra · Morris Yau
- 2018 Poster: Robust Subspace Approximation in a Stream »
  Roie Levin · Anish Prasad Sevekari · David Woodruff
- 2018 Spotlight: Robust Subspace Approximation in a Stream »
  Roie Levin · Anish Prasad Sevekari · David Woodruff
- 2018 Poster: Multiple Instance Learning for Efficient Sequential Data Classification on Resource-constrained Devices »
  Don Dennis · Chirag Pabbaraju · Harsha Vardhan Simhadri · Prateek Jain
- 2017 Poster: Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications »
  Linus Hamilton · Frederic Koehler · Ankur Moitra