Skip to yearly menu bar Skip to main content


A Minimax Approach to Supervised Learning

Farzan Farnia · David Tse

Area 5+6+7+8 #140

Keywords: [ Learning Theory ] [ (Other) Statistics ] [ Information Theory ]


Given a task of predicting Y from X, a loss function L, and a set of probability distributions Gamma on (X,Y), what is the optimal decision rule minimizing the worst-case expected loss over Gamma? In this paper, we address this question by introducing a generalization of the maximum entropy principle. Applying this principle to sets of distributions with marginal on X constrained to be the empirical marginal, we provide a minimax interpretation of the maximum likelihood problem over generalized linear models, which connects the minimax problem for each loss function to a generalized linear model. While in some cases such as quadratic and logarithmic loss functions we revisit well-known linear and logistic regression models, our approach reveals novel models for other loss functions. In particular, for the 0-1 loss we derive a classification approach which we call the minimax SVM. The minimax SVM minimizes the worst-case expected 0-1 loss over the proposed Gamma by solving a tractable optimization problem. We perform several numerical experiments in all of which the minimax SVM outperforms the SVM.

Live content is unavailable. Log in and register to view live content