

Poster

Trading off Consistency and Dimensionality of Convex Surrogates for Multiclass Classification

Enrique Nueve · Dhamma Kimpara · Bo Waggoner · Jessica Finocchiaro

East Exhibit Hall A-C #4405
Fri 13 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract: In multiclass classification over $n$ outcomes, we typically optimize some surrogate loss $L: \mathbb{R}^d \times\mathcal{Y} \to \mathbb{R}$ assigning real-valued error to predictions in $\mathbb{R}^d$. In this paradigm, outcomes must be embedded into the reals with dimension $d \approx n$ in order to design a consistent surrogate loss. Consistent losses are well-motivated theoretically, yet for large $n$, such as in information retrieval and structured prediction tasks, their optimization may be computationally infeasible. In practice, outcomes are typically embedded into some $\mathbb{R}^d$ for $d \ll n$, with little known about their suitability for multiclass classification. We investigate two approaches for trading off consistency and dimensionality in multiclass classification while using a convex surrogate loss. We first formalize partial consistency when the optimized surrogate has dimension $d \ll n$. We then check if partial consistency holds under a given embedding and low-noise assumption, providing insight into when to use a particular embedding into $\mathbb{R}^d$. Finally, we present a new method to construct (fully) consistent losses with $d \ll n$ out of multiple problem instances. Our practical approach leverages parallelism to sidestep lower bounds on $d$.
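To make the setup concrete, here is a minimal, purely illustrative sketch of the paradigm the abstract describes: $n$ outcomes embedded into $\mathbb{R}^d$ with $d \ll n$, a convex surrogate optimized over predictions in $\mathbb{R}^d$, and a decoding step back to an outcome. The random-sign embedding, squared surrogate, and nearest-embedding decoding below are hypothetical choices for illustration only; they are not the construction analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_outcomes = 1000          # number of classes n
d = 16                     # surrogate prediction dimension, d << n
n_features = 32

# Hypothetical embedding phi: Y -> R^d (random +/-1 signs, for illustration).
phi = rng.choice([-1.0, 1.0], size=(n_outcomes, d))

# Toy data: features X and labels y in {0, ..., n-1}.
X = rng.normal(size=(5000, n_features))
y = rng.integers(0, n_outcomes, size=5000)

# Linear predictor into R^d trained with a convex squared surrogate
# L(u, y) = ||u - phi(y)||^2, solved in closed form by least squares.
W, *_ = np.linalg.lstsq(X, phi[y], rcond=None)

# Decode a prediction u in R^d back to an outcome by nearest embedded class
# (equivalent to max inner product here, since all rows of phi share a norm).
u = X @ W                               # predictions in R^d
pred = np.argmax(u @ phi.T, axis=1)     # decoded class labels
```

Whether such a low-dimensional scheme is (partially) consistent depends on the embedding and on noise conditions on the data distribution, which is the question the paper formalizes.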
