We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute. A particular instance of interest is the L1-regularized MLE for learning Conditional Random Fields (CRFs), a popular class of statistical models for structured prediction problems such as sequence labeling, alignment, and classification with a label taxonomy. L1-regularized MLEs for CRFs are particularly expensive to optimize because computing the gradient requires an expensive inference step. In this work, we propose a carefully constructed proximal quasi-Newton algorithm for such computationally intensive M-estimation problems, combined with an aggressive active set selection technique. As a key contribution of the paper, we show that our proximal quasi-Newton algorithm is provably super-linearly convergent, even in the absence of strong convexity, by leveraging a restricted variant of strong convexity. In our experiments, the proposed algorithm converges considerably faster than current state-of-the-art methods on the problems of sequence labeling and hierarchical classification.
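As a rough illustration of the ingredients named in the abstract, the sketch below shows the soft-thresholding proximal operator for the L1 penalty and one simple active set rule. It is not the authors' algorithm: it takes a plain proximal gradient step rather than a quasi-Newton step, and the function names, the regularization weight `lam`, and the active set rule are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def active_set(w, grad, lam):
    # Hypothetical selection rule: keep coordinates that are already nonzero,
    # or whose smooth-loss gradient violates the optimality condition |g_j| <= lam.
    return np.flatnonzero((w != 0) | (np.abs(grad) > lam))

def prox_grad_step(w, grad, lam, step):
    # One proximal gradient step on loss(w) + lam * ||w||_1.
    # A proximal quasi-Newton method would replace the scalar step size
    # with a curvature (Hessian approximation) model of the smooth loss.
    return soft_threshold(w - step * grad, step * lam)
```

Restricting updates to the indices returned by `active_set` is one common way to limit work per iteration when most coordinates stay at zero, though the gradient itself still has to be computed.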
Author Information
Kai Zhong (Amazon)
Ian En-Hsu Yen (University of Texas at Austin)
Inderjit Dhillon (Google & UT Austin)
Pradeep Ravikumar (Carnegie Mellon University)
More from the Same Authors
2022 : Differentially Private Federated Learning with Normalized Updates »
Rudrajit Das · Abolfazl Hashemi · Sujay Sanghavi · Inderjit Dhillon -
2023 Poster: Block Low-Rank Preconditioner with Shared Basis for Stochastic Optimization »
Jui-Nan Yen · Sai Surya Duvvuri · Inderjit Dhillon · Cho-Jui Hsieh -
2023 Poster: A Computationally Efficient Sparsified Online Newton Method »
Fnu Devvrit · Sai Surya Duvvuri · Rohan Anil · Vineet Gupta · Cho-Jui Hsieh · Inderjit Dhillon -
2022 Poster: S3GC: Scalable Self-Supervised Graph Clustering »
Fnu Devvrit · Aditya Sinha · Inderjit Dhillon · Prateek Jain -
2022 Poster: ELIAS: End-to-End Learning to Index and Search in Large Output Spaces »
Nilesh Gupta · Patrick Chen · Hsiang-Fu Yu · Cho-Jui Hsieh · Inderjit Dhillon -
2021 Poster: Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification »
Jiong Zhang · Wei-Cheng Chang · Hsiang-Fu Yu · Inderjit Dhillon -
2021 Poster: Label Disentanglement in Partition-based Extreme Multilabel Classification »
Xuanqing Liu · Wei-Cheng Chang · Hsiang-Fu Yu · Cho-Jui Hsieh · Inderjit Dhillon -
2021 Poster: DRONE: Data-aware Low-rank Compression for Large NLP Models »
Patrick Chen · Hsiang-Fu Yu · Inderjit Dhillon · Cho-Jui Hsieh -
2019 : Lunch Break and Posters »
Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu -
2019 Poster: Provable Non-linear Inductive Matrix Completion »
Kai Zhong · Zhao Song · Prateek Jain · Inderjit Dhillon -
2019 Poster: Inverting Deep Generative models, One layer at a time »
Qi Lei · Ajil Jalal · Inderjit Dhillon · Alex Dimakis -
2019 Poster: Think Globally, Act Locally: A Deep Neural Network Approach to High-Dimensional Time Series Forecasting »
Rajat Sen · Hsiang-Fu Yu · Inderjit Dhillon -
2019 Poster: AutoAssist: A Framework to Accelerate Training of Deep Neural Networks »
Jiong Zhang · Hsiang-Fu Yu · Inderjit Dhillon -
2019 Poster: Primal-Dual Block Generalized Frank-Wolfe »
Qi Lei · Jiacheng Zhuo · Constantine Caramanis · Inderjit Dhillon · Alex Dimakis -
2018 Poster: MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization »
Ian En-Hsu Yen · Wei-Cheng Lee · Kai Zhong · Sung-En Chang · Pradeep Ravikumar · Shou-De Lin -
2017 Poster: A Greedy Approach for Budgeted Maximum Inner Product Search »
Hsiang-Fu Yu · Cho-Jui Hsieh · Qi Lei · Inderjit Dhillon -
2016 Poster: Asynchronous Parallel Greedy Coordinate Descent »
Yang You · Xiangru Lian · Ji Liu · Hsiang-Fu Yu · Inderjit Dhillon · James Demmel · Cho-Jui Hsieh -
2016 Poster: Coordinate-wise Power Method »
Qi Lei · Kai Zhong · Inderjit Dhillon -
2016 Poster: Structured Sparse Regression via Greedy Hard Thresholding »
Prateek Jain · Nikhil Rao · Inderjit Dhillon -
2016 Poster: Mixed Linear Regression with Multiple Components »
Kai Zhong · Prateek Jain · Inderjit Dhillon -
2016 Poster: Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction »
Hsiang-Fu Yu · Nikhil Rao · Inderjit Dhillon -
2016 Poster: Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain »
Ian En-Hsu Yen · Xiangru Huang · Kai Zhong · Ruohan Zhang · Pradeep Ravikumar · Inderjit Dhillon -
2015 Workshop: Multiresolution methods for large-scale learning »
Inderjit Dhillon · Risi Kondor · Rob Nowak · Michael O'Neil · Nedelina Teneva -
2015 Poster: Fast Classification Rates for High-dimensional Gaussian Generative Models »
Tianyang Li · Adarsh Prasad · Pradeep Ravikumar -
2015 Poster: Matrix Completion with Noisy Side Information »
Kai-Yang Chiang · Cho-Jui Hsieh · Inderjit Dhillon -
2015 Poster: Collaborative Filtering with Graph Information: Consistency and Scalable Methods »
Nikhil Rao · Hsiang-Fu Yu · Pradeep Ravikumar · Inderjit Dhillon -
2015 Spotlight: Collaborative Filtering with Graph Information: Consistency and Scalable Methods »
Nikhil Rao · Hsiang-Fu Yu · Pradeep Ravikumar · Inderjit Dhillon -
2015 Spotlight: Matrix Completion with Noisy Side Information »
Kai-Yang Chiang · Cho-Jui Hsieh · Inderjit Dhillon -
2015 Poster: Beyond Sub-Gaussian Measurements: High-Dimensional Structured Estimation with Sub-Exponential Designs »
Vidyashankar Sivakumar · Arindam Banerjee · Pradeep Ravikumar -
2015 Poster: Sparse Linear Programming via Primal and Dual Augmented Coordinate Descent »
Ian En-Hsu Yen · Kai Zhong · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Fixed-Length Poisson MRF: Adding Dependencies to the Multinomial »
David I Inouye · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Consistent Multilabel Classification »
Oluwasanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: A Dual Augmented Block Minimization Framework for Learning with Limited Memory »
Ian En-Hsu Yen · Shan-Wei Lin · Shou-De Lin -
2015 Poster: Closed-form Estimators for High-dimensional Generalized Linear Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2015 Spotlight: Closed-form Estimators for High-dimensional Generalized Linear Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2014 Poster: QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical Models »
Cho-Jui Hsieh · Inderjit Dhillon · Pradeep Ravikumar · Stephen Becker · Peder A Olsen -
2014 Poster: Consistent Binary Classification with Generalized Performance Metrics »
Sanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Fast Prediction for Large-Scale Kernel Machines »
Cho-Jui Hsieh · Si Si · Inderjit Dhillon -
2014 Poster: Multi-Scale Spectral Decomposition of Massive Graphs »
Si Si · Donghyuk Shin · Inderjit Dhillon · Beresford N Parlett -
2014 Poster: On the Information Theoretic Limits of Learning Ising Models »
Rashish Tandon · Karthikeyan Shanmugam · Pradeep Ravikumar · Alex Dimakis -
2014 Poster: Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space »
Ian En-Hsu Yen · Ting-Wei Lin · Shou-De Lin · Pradeep Ravikumar · Inderjit Dhillon -
2014 Spotlight: Consistent Binary Classification with Generalized Performance Metrics »
Sanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: A Representation Theory for Ranking Functions »
Harsh H Pareek · Pradeep Ravikumar -
2014 Poster: Capturing Semantically Meaningful Word Dependencies with an Admixture of Poisson MRFs »
David I Inouye · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings »
Ian En-Hsu Yen · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Elementary Estimators for Graphical Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2013 Workshop: Discrete Optimization in Machine Learning: Connecting Theory and Practice »
Stefanie Jegelka · Andreas Krause · Pradeep Ravikumar · Kazuo Murota · Jeffrey A Bilmes · Yisong Yue · Michael Jordan -
2013 Poster: Conditional Random Fields via Univariate Exponential Families »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2013 Poster: On Poisson Graphical Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2013 Poster: BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar · Russell Poldrack -
2013 Oral: BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar · Russell Poldrack -
2013 Poster: Dirty Statistical Models »
Eunho Yang · Pradeep Ravikumar -
2013 Poster: Large Scale Distributed Sparse Precision Estimation »
Huahua Wang · Arindam Banerjee · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2013 Poster: Learning with Noisy Labels »
Nagarajan Natarajan · Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2012 Workshop: Discrete Optimization in Machine Learning (DISCML): Structure and Scalability »
Stefanie Jegelka · Andreas Krause · Jeffrey A Bilmes · Pradeep Ravikumar -
2012 Poster: Graphical Models via Generalized Linear Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2012 Oral: Graphical Models via Generalized Linear Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2012 Poster: A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation »
Cho-Jui Hsieh · Inderjit Dhillon · Pradeep Ravikumar · Arindam Banerjee -
2011 Workshop: Discrete Optimization in Machine Learning (DISCML): Uncertainty, Generalization and Feedback »
Andreas Krause · Pradeep Ravikumar · Stefanie S Jegelka · Jeffrey A Bilmes -
2011 Poster: On Learning Discrete Graphical Models using Greedy Methods »
Ali Jalali · Christopher C Johnson · Pradeep Ravikumar -
2011 Spotlight: On Learning Discrete Graphical Models using Greedy Methods »
Ali Jalali · Christopher C Johnson · Pradeep Ravikumar -
2011 Poster: Greedy Algorithms for Structurally Constrained High Dimensional Problems »
Ambuj Tewari · Pradeep Ravikumar · Inderjit Dhillon -
2011 Poster: Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar -
2011 Session: Oral Session 5 »
Pradeep Ravikumar -
2011 Poster: Nearest Neighbor based Greedy Coordinate Descent »
Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2011 Poster: Orthogonal Matching Pursuit with Replacement »
Prateek Jain · Ambuj Tewari · Inderjit Dhillon -
2010 Workshop: Discrete Optimization in Machine Learning: Structures, Algorithms and Applications »
Andreas Krause · Pradeep Ravikumar · Jeffrey A Bilmes · Stefanie Jegelka -
2010 Workshop: Robust Statistical Learning »
Pradeep Ravikumar · Constantine Caramanis · Sujay Sanghavi -
2010 Session: Oral Session 14 »
Pradeep Ravikumar -
2010 Spotlight: Guaranteed Rank Minimization via Singular Value Projection »
Prateek Jain · Raghu Meka · Inderjit Dhillon -
2010 Poster: Guaranteed Rank Minimization via Singular Value Projection »
Prateek Jain · Raghu Meka · Inderjit Dhillon -
2010 Spotlight: Inductive Regularized Learning of Kernel Functions »
Prateek Jain · Brian Kulis · Inderjit Dhillon -
2010 Oral: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2010 Poster: Inductive Regularized Learning of Kernel Functions »
Prateek Jain · Brian Kulis · Inderjit Dhillon -
2010 Poster: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2009 Workshop: Discrete Optimization in Machine Learning: Submodularity, Polyhedra and Sparsity »
Andreas Krause · Pradeep Ravikumar · Jeffrey A Bilmes -
2009 Poster: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Spotlight: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Poster: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2009 Oral: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2009 Poster: Matrix Completion from Power-Law Distributed Samples »
Raghu Meka · Prateek Jain · Inderjit Dhillon -
2008 Poster: Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images »
Pradeep Ravikumar · Vincent Vu · Bin Yu · Thomas Naselaris · Kendrick Kay · Jack Gallant -
2008 Spotlight: Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images »
Pradeep Ravikumar · Vincent Vu · Bin Yu · Thomas Naselaris · Kendrick Kay · Jack Gallant -
2008 Poster: Online Metric Learning and Fast Similarity Search »
Prateek Jain · Brian Kulis · Inderjit Dhillon · Kristen Grauman -
2008 Oral: Online Metric Learning and Fast Similarity Search »
Prateek Jain · Brian Kulis · Inderjit Dhillon · Kristen Grauman -
2008 Poster: Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of $\ell_1$-regularized MLE »
Pradeep Ravikumar · Garvesh Raskutti · Martin J Wainwright · Bin Yu -
2007 Poster: SpAM: Sparse Additive Models »
Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman -
2007 Spotlight: SpAM: Sparse Additive Models »
Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman -
2006 Poster: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty -
2006 Spotlight: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty -
2006 Poster: Differential Entropic Clustering of Multivariate Gaussians »
Jason V Davis · Inderjit Dhillon