Timezone: »
We study consistency properties of algorithms for non-decomposable performance measures that cannot be expressed as a sum of losses on individual data points, such as the F-measure used in text retrieval and several other performance measures used in class imbalanced settings. While there has been much work on designing algorithms for such performance measures, there is limited understanding of the theoretical properties of these algorithms. Recently, Ye et al. (2012) showed consistency results for two algorithms that optimize the F-measure, but their results apply only to an idealized setting, where precise knowledge of the underlying probability distribution (in the form of the true' posterior class probability) is available to a learning algorithm. In this work, we consider plug-in algorithms that learn a classifier by applying an empirically determined threshold to a suitable
estimate' of the class probability, and provide a general methodology to show consistency of these methods for any non-decomposable measure that can be expressed as a continuous function of true positive rate (TPR) and true negative rate (TNR), and for which the Bayes optimal classifier is the class probability function thresholded suitably. We use this template to derive consistency results for plug-in algorithms for the F-measure and for the geometric mean of TPR and precision; to our knowledge, these are the first such results for these measures. In addition, for continuous distributions, we show consistency of plug-in algorithms for any performance measure that is a continuous and monotonically increasing function of TPR and TNR. Experimental results confirm our theoretical findings.
Author Information
Harikrishna Narasimhan (Google AI)
Rohit Vaish (Indian Insititute of Science)
Shivani Agarwal (University of Pennsylvania)
More from the Same Authors
-
2016 Poster: Dueling Bandits: Beyond Condorcet Winners to General Tournament Solutions »
Siddartha Ramamohan · Arun Rajkumar · Shivani Agarwal · Shivani Agarwal -
2014 Workshop: Analysis of Rank Data: Confluence of Social Choice, Operations Research, and Machine Learning »
Shivani Agarwal · Hossein Azari Soufiani · Guy Bresler · Sewoong Oh · David Parkes · Arun Rajkumar · Devavrat Shah -
2014 Poster: Online and Stochastic Gradient Methods for Non-decomposable Loss Functions »
Purushottam Kar · Harikrishna Narasimhan · Prateek Jain -
2014 Poster: Online Decision-Making in General Combinatorial Spaces »
Arun Rajkumar · Shivani Agarwal -
2013 Poster: Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses »
Harish G Ramaswamy · Shivani Agarwal · Ambuj Tewari -
2013 Poster: On the Relationship Between Binary Classification, Bipartite Ranking, and Binary Class Probability Estimation »
Harikrishna Narasimhan · Shivani Agarwal -
2013 Spotlight: On the Relationship Between Binary Classification, Bipartite Ranking, and Binary Class Probability Estimation »
Harikrishna Narasimhan · Shivani Agarwal -
2013 Spotlight: Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses »
Harish G Ramaswamy · Shivani Agarwal · Ambuj Tewari -
2012 Poster: Classification Calibration Dimension for General Multiclass Losses »
Harish G Ramaswamy · Shivani Agarwal -
2012 Spotlight: Classification Calibration Dimension for General Multiclass Losses »
Harish G Ramaswamy · Shivani Agarwal -
2009 Workshop: Advances in Ranking »
Shivani Agarwal · Chris J Burges · Yacov Crammer