We investigate a low-rank model of quadratic classification inspired by previous work on factorization machines, polynomial networks, and capsule-based architectures for visual object recognition. The model is parameterized by a pair of affine transformations, and it classifies examples by comparing the magnitudes of vectors that these transformations produce. The model is also over-parameterized in the sense that different pairs of affine transformations can describe classifiers with the same decision boundary and confidence scores. We show that such pairs arise from discrete and continuous symmetries of the model’s parameter space: in particular, the latter define symmetry groups of rotations and Lorentz transformations, and we use these group structures to devise appropriately invariant procedures for model alignment and averaging. We also leverage the form of the model’s decision boundary to derive simple margin-based updates for online learning. Here we explore a strategy of passive-aggressive learning: for each example, we compute the minimum change in parameters that is required to predict its correct label with high confidence. We derive these updates by solving a quadratically constrained quadratic program (QCQP); interestingly, this QCQP is nonconvex but tractable, and it can be solved efficiently by elementary methods. We highlight the conceptual and practical contributions of this approach. Conceptually, we show that it extends the paradigm of passive-aggressive learning to a larger family of nonlinear models for classification. Practically, we show that these models perform well on large-scale problems in online learning.
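The over-parameterization described above can be illustrated with a small sketch. It assumes, since the abstract does not give the formula, that "comparing the magnitudes" means taking the sign of a difference of squared norms, and that the continuous symmetries act as an ordinary rotation within one transformation and as a hyperbolic (Lorentz-style) mixing across the two. The names `affine`, `score`, `predict`, `rotate`, and `boost` are hypothetical and are not taken from the paper.

```python
import math
import random


def affine(W, b, x):
    """Apply the affine map x -> W x + b, with W given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def score(pos, neg, x):
    """Assumed score: squared magnitude under `pos` minus that under `neg`."""
    u = affine(pos[0], pos[1], x)
    v = affine(neg[0], neg[1], x)
    return sum(t * t for t in u) - sum(t * t for t in v)


def predict(pos, neg, x):
    """Label +1 if the first transformed vector is at least as long, else -1."""
    return 1 if score(pos, neg, x) >= 0 else -1


def rotate(params, theta):
    """Rotate output coordinates 0 and 1 of a single affine map by theta."""
    W, b = params
    c, s = math.cos(theta), math.sin(theta)
    W_new = [row[:] for row in W]
    b_new = b[:]
    W_new[0] = [c * p - s * q for p, q in zip(W[0], W[1])]
    W_new[1] = [s * p + c * q for p, q in zip(W[0], W[1])]
    b_new[0], b_new[1] = c * b[0] - s * b[1], s * b[0] + c * b[1]
    return W_new, b_new


def boost(pos, neg, t):
    """Hyperbolically mix output coordinate 0 of `pos` with that of `neg`."""
    (Wp, bp), (Wn, bn) = pos, neg
    ch, sh = math.cosh(t), math.sinh(t)
    Wp_new, Wn_new = [r[:] for r in Wp], [r[:] for r in Wn]
    bp_new, bn_new = bp[:], bn[:]
    Wp_new[0] = [ch * p + sh * q for p, q in zip(Wp[0], Wn[0])]
    Wn_new[0] = [sh * p + ch * q for p, q in zip(Wp[0], Wn[0])]
    bp_new[0] = ch * bp[0] + sh * bn[0]
    bn_new[0] = sh * bp[0] + ch * bn[0]
    return (Wp_new, bp_new), (Wn_new, bn_new)


def random_affine(rng, rows, dim):
    """Draw a random affine map (illustrative data only)."""
    W = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]
    b = [rng.uniform(-1, 1) for _ in range(rows)]
    return W, b


rng = random.Random(0)
pos, neg = random_affine(rng, 2, 3), random_affine(rng, 2, 3)
x = [rng.uniform(-1, 1) for _ in range(3)]

base = score(pos, neg, x)
rotated = score(rotate(pos, 0.3), neg, x)   # rotation inside one map
boosted = score(*boost(pos, neg, 0.7), x)   # hyperbolic mixing across maps
```

Under these assumptions, `base`, `rotated`, and `boosted` agree up to floating-point error, because a rotation preserves each squared norm and the hyperbolic mixing preserves their difference (cosh² − sinh² = 1). Different parameter pairs therefore yield the same decision boundary and the same scores in this toy setting.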
Author Information
Lawrence Saul (UC San Diego)
More from the Same Authors
2012 Poster: Latent Coincidence Analysis: A Hidden Variable Model for Distance Metric Learning
Matthew F Der · Lawrence Saul
2011 Poster: Maximum Covariance Unfolding: Manifold Learning for Bimodal Data
Vijay Mahadevan · Chi Wah Wong · Jose Costa Pereira · Tom Liu · Nuno Vasconcelos · Lawrence Saul
2010 Talk: Manifold Learning
Lawrence Saul
2010 Poster: Latent Variable Models for Predicting File Dependencies in Large-Scale Software Development
Diane Hu · Laurens van der Maaten · Youngmin Cho · Lawrence Saul · Sorin Lerner
2009 Poster: Kernel Methods for Deep Learning
Youngmin Cho · Lawrence Saul
2006 Poster: Large Margin Gaussian Mixture Models for Automatic Speech Recognition
Fei Sha · Lawrence Saul
2006 Talk: Large Margin Gaussian Mixture Models for Automatic Speech Recognition
Fei Sha · Lawrence Saul
2006 Poster: Graph Regularization for Maximum Variance Unfolding with an Application to Sensor Localization
Kilian Q Weinberger · Fei Sha · Qihui Zhu · Lawrence Saul