Poster
Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models
Gideon S Mann · Ryan McDonald · Mehryar Mohri · Nathan Silberman · Dan Walker
Training conditional maximum entropy models on massive data requires significant time and computational resources. In this paper, we investigate three common distributed training strategies: distributed gradient, majority voting ensembles, and parameter mixtures. We analyze the worst-case runtime and resource costs of each and present a theoretical foundation for the convergence of parameters under parameter mixtures, the most efficient strategy. We present large-scale experiments comparing the different strategies and demonstrate that parameter mixtures over independent models use fewer resources and achieve loss comparable to that of standard approaches.
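To make the parameter-mixture strategy named in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes a binary conditional maximum entropy model (logistic regression), batch gradient descent with an L2 penalty, uniform mixture weights, and synthetic data; the function names (`train_maxent`, `parameter_mixture`, `make_data`) and all hyperparameters are hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_maxent(data, dim, epochs=200, lr=0.5, l2=0.01):
    """Fit a binary conditional maxent model by batch gradient descent (assumed optimizer)."""
    w = [0.0] * dim
    n = len(data)
    for _ in range(epochs):
        grad = [l2 * wj for wj in w]  # gradient of the penalty (l2 / 2) * ||w||^2
        for x, y in data:
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(dim):
                grad[j] += err * x[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def parameter_mixture(shards, dim, **kw):
    """Train one model per shard independently, then average the weights uniformly."""
    models = [train_maxent(shard, dim, **kw) for shard in shards]
    return [sum(m[j] for m in models) / len(models) for j in range(dim)]

def log_loss(w, data):
    # Average negative conditional log-likelihood.
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def make_data(n, rng):
    # Synthetic examples: two Gaussian features plus a constant bias feature.
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1), 1.0]
        p = sigmoid(2.0 * x[0] - 1.0 * x[1] + 0.5)
        data.append((x, 1 if rng.random() < p else 0))
    return data

rng = random.Random(0)
train = make_data(1200, rng)
shards = [train[i::4] for i in range(4)]      # four disjoint shards, one per worker
w_mix = parameter_mixture(shards, dim=3)      # each shard's model is trained in isolation
w_full = train_maxent(train, dim=3)           # single-machine baseline on all the data
```

In this sketch the shard models never communicate during training, so the only data exchanged is one weight vector per shard at the end. Comparing `log_loss(w_mix, train)` with `log_loss(w_full, train)` gives a rough sense of how close the averaged model is to the baseline on this toy problem; it says nothing about the paper's reported results.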
Author Information
Gideon S Mann (Google Inc.)
Ryan McDonald (Google)
Mehryar Mohri (Google Research & Courant Institute of Mathematical Sciences)
Nathan Silberman
Dan Walker
Related Events (a corresponding poster, oral, or spotlight)
-
2009 Spotlight: Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models »
Wed. Dec 9th 11:25 -- 11:26 PM Room
More from the Same Authors
-
2021 Spotlight: Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations »
Ayush Sekhari · Christoph Dann · Mehryar Mohri · Yishay Mansour · Karthik Sridharan -
2021 Spotlight: On the Existence of The Adversarial Bayes Classifier »
Pranjal Awasthi · Natalie Frank · Mehryar Mohri -
2021 Spotlight: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Spotlight: Calibration and Consistency of Adversarial Surrogate Losses »
Pranjal Awasthi · Natalie Frank · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 : A Theory of Learning with Competing Objectives and User Feedback »
Pranjal Awasthi · Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2022 : AdaME: Adaptive learning of multisource adaptation ensembles »
Scott Yak · Javier Gonzalvo · Mehryar Mohri · Corinna Cortes -
2023 Poster: $H$-Consistency Bounds: Characterization and Extensions »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2023 Poster: Structured Prediction with Stronger Consistency Guarantees »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2023 Poster: Two-Stage Learning to Defer with Multiple Experts »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: Differentially Private Learning with Margin Guarantees »
Raef Bassily · Mehryar Mohri · Ananda Theertha Suresh -
2022 : Invited Talk #1, Differentially Private Learning with Margin Guarantees, Mehryar Mohri »
Mehryar Mohri -
2022 Poster: Multi-Class $H$-Consistency Bounds »
Pranjal Awasthi · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 Poster: Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality »
Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2022 Poster: Differentially Private Learning with Margin Guarantees »
Raef Bassily · Mehryar Mohri · Ananda Theertha Suresh -
2021 Poster: A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning »
Christoph Dann · Mehryar Mohri · Tong Zhang · Julian Zimmert -
2021 Poster: On the Existence of The Adversarial Bayes Classifier »
Pranjal Awasthi · Natalie Frank · Mehryar Mohri -
2021 Poster: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning »
Christoph Dann · Teodor Vanislavov Marinov · Mehryar Mohri · Julian Zimmert -
2021 Poster: Learning with User-Level Privacy »
Daniel Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh -
2021 Poster: Boosting with Multiple Sources »
Corinna Cortes · Mehryar Mohri · Dmitry Storcheus · Ananda Theertha Suresh -
2021 Poster: Breaking the centralized barrier for cross-device federated learning »
Sai Praneeth Karimireddy · Martin Jaggi · Satyen Kale · Mehryar Mohri · Sashank Reddi · Sebastian Stich · Ananda Theertha Suresh -
2021 Poster: Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations »
Ayush Sekhari · Christoph Dann · Mehryar Mohri · Yishay Mansour · Karthik Sridharan -
2021 Poster: Calibration and Consistency of Adversarial Surrogate Losses »
Pranjal Awasthi · Natalie Frank · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2020 Poster: Adapting to Misspecification in Contextual Bandits »
Dylan Foster · Claudio Gentile · Mehryar Mohri · Julian Zimmert -
2020 Poster: Agnostic Learning with Multiple Objectives »
Corinna Cortes · Mehryar Mohri · Javier Gonzalvo · Dmitry Storcheus -
2020 Poster: Reinforcement Learning with Feedback Graphs »
Christoph Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan -
2020 Poster: PAC-Bayes Learning Bounds for Sample-Dependent Priors »
Pranjal Awasthi · Satyen Kale · Stefani Karp · Mehryar Mohri -
2019 : Mehryar Mohri, "Learning with Sample-Dependent Hypothesis Sets" »
Mehryar Mohri -
2019 Poster: Learning GANs and Ensembles Using Discrepancy »
Ben Adlam · Corinna Cortes · Mehryar Mohri · Ningshan Zhang -
2019 Poster: Bandits with Feedback Graphs and Switching Costs »
Raman Arora · Teodor Vanislavov Marinov · Mehryar Mohri -
2019 Poster: Regularized Gradient Boosting »
Corinna Cortes · Mehryar Mohri · Dmitry Storcheus -
2019 Poster: Hypothesis Set Stability and Generalization »
Dylan Foster · Spencer Greenberg · Satyen Kale · Haipeng Luo · Mehryar Mohri · Karthik Sridharan -
2018 Poster: Policy Regret in Repeated Games »
Raman Arora · Michael Dinitz · Teodor Vanislavov Marinov · Mehryar Mohri -
2018 Poster: Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses »
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Dmitry Storcheus · Scott Yang -
2018 Poster: Algorithms and Theory for Multiple-Source Adaptation »
Judy Hoffman · Mehryar Mohri · Ningshan Zhang -
2017 : Mehryar Mohri (NYU) on Tight Learning Bounds for Multi-Class Classification »
Mehryar Mohri -
2017 : (Invited Talk) Mehryar Mohri: Regret minimization against strategic buyers. »
Mehryar Mohri -
2017 Poster: Discriminative State Space Models »
Vitaly Kuznetsov · Mehryar Mohri -
2017 Poster: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2017 Poster: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2016 Poster: Structured Prediction Theory Based on Factor Graph Complexity »
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Scott Yang -
2016 Poster: Domain Separation Networks »
Konstantinos Bousmalis · George Trigeorgis · Nathan Silberman · Dilip Krishnan · Dumitru Erhan -
2016 Poster: Boosting with Abstention »
Corinna Cortes · Giulia DeSalvo · Mehryar Mohri -
2016 Poster: Optimistic Bandit Convex Optimization »
Scott Yang · Mehryar Mohri -
2016 Tutorial: Theory and Algorithms for Forecasting Non-Stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 : A Theory of Multiple Source Adaptation »
Mehryar Mohri -
2015 : Learning Theory and Algorithms for Time Series »
Mehryar Mohri -
2015 Poster: Revenue Optimization against Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2015 Poster: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 Oral: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2014 Workshop: Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice »
Urun Dogan · Tatiana Tommasi · Yoshua Bengio · Francesco Orabona · Marius Kloft · Andres Munoz · Gunnar Rätsch · Hal Daumé III · Mehryar Mohri · Xuezhi Wang · Daniel Hernández-lobato · Song Liu · Thomas Unterthiner · Pascal Germain · Vinay P Namboodiri · Michael Goetz · Christopher Berlind · Sigurd Spieckermann · Marta Soare · Yujia Li · Vitaly Kuznetsov · Wenzhao Lian · Daniele Calandriello · Emilie Morvant -
2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie -
2014 Poster: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Poster: Multi-Class Deep Boosting »
Vitaly Kuznetsov · Mehryar Mohri · Umar Syed -
2014 Spotlight: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Session: Oral Session 6 »
Mehryar Mohri -
2014 Poster: Conditional Swap Regret and Conditional Correlated Equilibrium »
Mehryar Mohri · Scott Yang -
2013 Poster: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2013 Spotlight: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2012 Poster: Accuracy at the Top »
Stephen Boyd · Corinna Cortes · Mehryar Mohri · Ana Radovanovic -
2012 Poster: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2012 Oral: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2011 Workshop: Sparse Representation and Low-rank Approximation »
Ameet S Talwalkar · Lester W Mackey · Mehryar Mohri · Michael W Mahoney · Francis Bach · Mike Davies · Remi Gribonval · Guillaume R Obozinski -
2010 Workshop: Low-rank Methods for Large-scale Machine Learning »
Arthur Gretton · Michael W Mahoney · Mehryar Mohri · Ameet S Talwalkar -
2010 Poster: Learning Bounds for Importance Weighting »
Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2009 Poster: Ensemble Nystrom Method »
Sanjiv Kumar · Mehryar Mohri · Ameet S Talwalkar -
2009 Poster: Learning Non-Linear Combinations of Kernels »
Corinna Cortes · Mehryar Mohri · Afshin Rostamizadeh -
2009 Poster: Polynomial Semantic Indexing »
Bing Bai · Jason E Weston · David Grangier · Ronan Collobert · Kunihiko Sadamasa · Yanjun Qi · Corinna Cortes · Mehryar Mohri -
2008 Workshop: Kernel Learning: Automatic Selection of Optimal Kernels »
Corinna Cortes · Arthur Gretton · Gert Lanckriet · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Spotlight: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Rademacher Complexity Bounds for Non-I.I.D. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2007 Poster: Stability Bounds for Non-i.i.d. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2006 Poster: On Transductive Regression »
Corinna Cortes · Mehryar Mohri