Timezone: »
Poster
Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Dmitry Storcheus · Scott Yang
Many structured prediction problems admit a natural loss function for evaluation such as the edit-distance or $n$-gram loss. However, existing learning algorithms are typically designed to optimize alternative objectives such as the cross-entropy. This is because a na\"{i}ve implementation of the natural loss functions often results in intractable gradient computations. In this paper, we design efficient gradient computation algorithms for two broad families of structured prediction loss functions: rational and tropical losses. These families include as special cases the $n$-gram loss, the edit-distance loss, and many other loss functions commonly used in natural language processing and computational biology tasks that are based on sequence similarity measures. Our algorithms make use of weighted automata and graph operations over appropriate semirings to design efficient solutions. They facilitate efficient gradient computation and hence enable one to train learning models such as neural networks with complex structured losses.
Author Information
Corinna Cortes (Google Research)
Vitaly Kuznetsov (Google)
Mehryar Mohri (Courant Inst. of Math. Sciences & Google Research)
Dmitry Storcheus (Google Research)
Scott Yang (D. E. Shaw & Co.)
More from the Same Authors
-
2020 Poster: Adapting to Misspecification in Contextual Bandits »
Dylan Foster · Claudio Gentile · Mehryar Mohri · Julian Zimmert -
2020 Poster: Agnostic Learning with Multiple Objectives »
Corinna Cortes · Mehryar Mohri · Javier Gonzalvo · Dmitry Storcheus -
2020 Poster: Reinforcement Learning with Feedback Graphs »
Christoph Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan -
2020 Poster: PAC-Bayes Learning Bounds for Sample-Dependent Priors »
Pranjal Awasthi · Satyen Kale · Stefani Karp · Mehryar Mohri -
2019 Poster: Learning GANs and Ensembles Using Discrepancy »
Ben Adlam · Corinna Cortes · Mehryar Mohri · Ningshan Zhang -
2019 Poster: Bandits with Feedback Graphs and Switching Costs »
Raman Arora · Teodor Vanislavov Marinov · Mehryar Mohri -
2019 Poster: Regularized Gradient Boosting »
Corinna Cortes · Mehryar Mohri · Dmitry Storcheus -
2019 Poster: Hypothesis Set Stability and Generalization »
Dylan Foster · Spencer Greenberg · Satyen Kale · Haipeng Luo · Mehryar Mohri · Karthik Sridharan -
2018 Poster: Policy Regret in Repeated Games »
Raman Arora · Michael Dinitz · Teodor Vanislavov Marinov · Mehryar Mohri -
2018 Poster: Algorithms and Theory for Multiple-Source Adaptation »
Judy Hoffman · Mehryar Mohri · Ningshan Zhang -
2017 Workshop: NIPS 2017 Time Series Workshop »
Vitaly Kuznetsov · Oren Anava · Scott Yang · Azadeh Khaleghi -
2017 Poster: Discriminative State Space Models »
Vitaly Kuznetsov · Mehryar Mohri -
2017 Poster: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2017 Poster: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Parameter-Free Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan -
2017 Spotlight: Online Learning with Transductive Regret »
Scott Yang · Mehryar Mohri -
2016 Poster: Structured Prediction Theory Based on Factor Graph Complexity »
Corinna Cortes · Vitaly Kuznetsov · Mehryar Mohri · Scott Yang -
2016 Poster: Boosting with Abstention »
Corinna Cortes · Giulia DeSalvo · Mehryar Mohri -
2016 Poster: Optimistic Bandit Convex Optimization »
Scott Yang · Mehryar Mohri -
2016 Tutorial: Theory and Algorithms for Forecasting Non-Stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 Poster: Revenue Optimization against Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2015 Poster: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2015 Oral: Learning Theory and Algorithms for Forecasting Non-stationary Time Series »
Vitaly Kuznetsov · Mehryar Mohri -
2014 Workshop: Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice »
Urun Dogan · Tatiana Tommasi · Yoshua Bengio · Francesco Orabona · Marius Kloft · Andres Munoz · Gunnar Rätsch · Hal Daumé III · Mehryar Mohri · Xuezhi Wang · Daniel Hernández-lobato · Song Liu · Thomas Unterthiner · Pascal Germain · Vinay P Namboodiri · Michael Goetz · Christopher Berlind · Sigurd Spieckermann · Marta Soare · Yujia Li · Vitaly Kuznetsov · Wenzhao Lian · Daniele Calandriello · Emilie Morvant -
2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie -
2014 Poster: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Poster: Multi-Class Deep Boosting »
Vitaly Kuznetsov · Mehryar Mohri · Umar Syed -
2014 Spotlight: Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers »
Mehryar Mohri · Andres Munoz -
2014 Session: Oral Session 6 »
Mehryar Mohri -
2014 Poster: Conditional Swap Regret and Conditional Correlated Equilibrium »
Mehryar Mohri · Scott Yang -
2013 Poster: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2013 Spotlight: Learning Kernels Using Local Rademacher Complexity »
Corinna Cortes · Marius Kloft · Mehryar Mohri -
2013 Session: Oral Session 6 »
Corinna Cortes -
2012 Poster: Accuracy at the Top »
Stephen Boyd · Corinna Cortes · Mehryar Mohri · Ana Radovanovic -
2012 Poster: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2012 Oral: Spectral Learning of General Weighted Automata via Constrained Matrix Completion »
Borja Balle · Mehryar Mohri -
2011 Workshop: Domain Adaptation Workshop: Theory and Application »
John Blitzer · Corinna Cortes · Afshin Rostamizadeh -
2011 Workshop: Sparse Representation and Low-rank Approximation »
Ameet S Talwalkar · Lester W Mackey · Mehryar Mohri · Michael W Mahoney · Francis Bach · Mike E davies · Remi Gribonval · Guillaume R Obozinski -
2010 Workshop: Low-rank Methods for Large-scale Machine Learning »
Arthur Gretton · Michael W Mahoney · Mehryar Mohri · Ameet S Talwalkar -
2010 Poster: Learning Bounds for Importance Weighting »
Corinna Cortes · Yishay Mansour · Mehryar Mohri -
2009 Poster: Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models »
Gideon S Mann · Ryan McDonald · Mehryar Mohri · Nathan Silberman · Dan Walker -
2009 Poster: Ensemble Nystrom Method »
Sanjiv Kumar · Mehryar Mohri · Ameet S Talwalkar -
2009 Spotlight: Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models »
Gideon S Mann · Ryan McDonald · Mehryar Mohri · Nathan Silberman · Dan Walker -
2009 Poster: Learning Non-Linear Combinations of Kernels »
Corinna Cortes · Mehryar Mohri · Afshin Rostamizadeh -
2009 Poster: Polynomial Semantic Indexing »
Bing Bai · Jason E Weston · David Grangier · Ronan Collobert · Kunihiko Sadamasa · Yanjun Qi · Corinna Cortes · Mehryar Mohri -
2008 Workshop: Kernel Learning: Automatic Selection of Optimal Kernels »
Corinna Cortes · Arthur Gretton · Gert Lanckriet · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Spotlight: Domain Adaptation with Multiple Sources »
Yishay Mansour · Mehryar Mohri · Afshin Rostamizadeh -
2008 Poster: Rademacher Complexity Bounds for Non-I.I.D. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2007 Workshop: Efficient Machine Learning - Overcoming Computational Bottlenecks in Machine Learning (Part 2) »
Samy Bengio · Corinna Cortes · Dennis DeCoste · Francois Fleuret · Ramesh Natarajan · Edwin Pednault · Dan Pelleg · Elad Yom-Tov -
2007 Workshop: Efficient Machine Learning - Overcoming Computational Bottlenecks in Machine Learning (Part 1) »
Samy Bengio · Corinna Cortes · Dennis DeCoste · Francois Fleuret · Ramesh Natarajan · Edwin Pednault · Dan Pelleg · Elad Yom-Tov -
2007 Poster: Stability Bounds for Non-i.i.d. Processes »
Mehryar Mohri · Afshin Rostamizadeh -
2006 Poster: On Transductive Regression »
Corinna Cortes · Mehryar Mohri