Factorization machines and polynomial networks are supervised polynomial models based on an efficient low-rank decomposition. We extend these models to the multi-output setting, i.e., for learning vector-valued functions, with application to multi-class or multi-task problems. We cast this as the problem of learning a 3-way tensor whose slices share a common basis and propose a convex formulation of that problem. We then develop an efficient conditional gradient algorithm and prove its global convergence, despite the fact that it involves a non-convex basis selection step. On classification tasks, we show that our algorithm achieves excellent accuracy with much sparser models than existing methods. On recommendation system tasks, we show how to combine our algorithm with a reduction from ordinal regression to multi-output classification and show that the resulting algorithm outperforms simple baselines in terms of ranking accuracy.
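The core modeling idea described above, a 3-way tensor whose slices share a common basis, can be illustrated with a small sketch. In this hedged, illustrative version (not the paper's actual algorithm), each output's quadratic weight matrix is W_c = U diag(lam_c) U^T, so all outputs share the basis U; the names `U`, `lam`, and `predict` are made up for the example.

```python
import numpy as np

# Illustrative sketch of a multi-output quadratic (polynomial) model
# whose per-output weight matrices share a common basis U:
#   W_c = U @ diag(lam[c]) @ U.T  (slice c of a 3-way tensor)
#   y_c(x) = x^T W_c x = sum_r lam[c, r] * (u_r . x)^2
rng = np.random.default_rng(0)
d, k, n_outputs = 5, 3, 4          # features, basis size (rank), outputs

U = rng.standard_normal((d, k))        # shared basis, d x k
lam = rng.standard_normal((n_outputs, k))  # per-output combination weights

def predict(x):
    """Score all outputs at once via the shared basis."""
    z = (U.T @ x) ** 2                 # (k,) squared basis projections
    return lam @ z                     # (n_outputs,) scores

x = rng.standard_normal(d)
scores = predict(x)
print(scores.shape)  # (4,)
```

Because every slice reuses the same k basis vectors, the model has d*k + C*k parameters instead of C*d^2, which is what makes the low-rank multi-output formulation compact.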
Mathieu Blondel (NTT)
Research scientist at NTT CS Labs.
Vlad Niculae (Cornell University)
Takuma Otsuka (NTT Communication Science Labs)
Naonori Ueda (NTT Communication Science Laboratories / RIKEN AIP)