Workshop
Thu Dec 08 11:00 PM -- 09:30 AM (PST) @ Room 111
Extreme Classification: Multi-class and Multi-label Learning in Extremely Large Label Spaces
Moustapha Cisse · Manik Varma · Samy Bengio

Extreme classification, where one needs to deal with multi-class and multi-label problems involving a very large number of labels, has opened up a new research frontier in machine learning. Many challenging applications, such as photo or video annotation, web page categorization, gene function prediction, and language modeling, can benefit from being formulated as supervised learning tasks with millions, or even billions, of labels. Extreme classification can also give a fresh perspective on core learning problems such as ranking and recommendation by reformulating them as multi-class or multi-label tasks in which each item to be ranked or recommended is a separate label.
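
As a rough illustration of that reformulation, the sketch below treats each recommendable item as one label: user-item interactions become binary label vectors, and recommending reduces to ranking labels by score. The function names, the toy interactions and the score values are all hypothetical, and the scores stand in for the output of any multi-label classifier; this is not a method from the workshop.

```python
from collections import defaultdict


def build_label_matrix(interactions, num_items):
    """Turn (user, item) pairs into one binary label vector per user.

    Each item index is treated as a separate label.
    """
    labels = defaultdict(lambda: [0] * num_items)
    for user, item in interactions:
        labels[user][item] = 1
    return dict(labels)


def top_k(scores, k):
    """Return the indices of the k highest-scoring labels, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical toy data: three users and five items (labels 0..4).
interactions = [("u1", 0), ("u1", 3), ("u2", 1), ("u3", 3), ("u3", 4)]
Y = build_label_matrix(interactions, num_items=5)

# Hypothetical per-label scores for one user, as a classifier might emit.
scores = [0.1, 0.7, 0.2, 0.9, 0.4]
recommended = top_k(scores, k=2)
print(Y["u1"], recommended)
```

With millions of items, the dense vectors and the full sort used here would be impractical; they are kept only to make the mapping from items to labels explicit.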

Extreme classification raises a number of interesting research questions including those related to:

* Large-scale learning and distributed and parallel training
* Log-time and log-space prediction and prediction on a test-time budget
* Label embedding and tree-based approaches
* Crowdsourcing, preference elicitation and other data gathering techniques
* Bandits, semi-supervised learning and other approaches for dealing with training set biases and label noise
* Bandits with an extremely large number of arms
* Fine-grained classification
* Zero-shot learning and extensible output spaces
* Tackling label polysemy, synonymy and correlations
* Structured output prediction and multi-task learning
* Learning from highly imbalanced data
* Dealing with tail labels and learning from very few data points per label
* PU learning and learning from missing and incorrect labels
* Feature extraction, feature sharing, lazy feature evaluation, etc.
* Performance evaluation
* Statistical analysis and generalization bounds
* Applications to ranking, recommendation, knowledge graph construction and other domains

The workshop aims to bring together researchers interested in these areas to encourage discussion and improve upon the state of the art in extreme classification. In particular, we aim to bring together researchers from the natural language processing, computer vision and core machine learning communities to foster interaction and collaboration. Several leading researchers will present invited talks detailing the latest advances in the area. We also seek extended abstracts presenting work in progress, which will be reviewed for acceptance as a spotlight plus poster or as a talk. The workshop should be of interest to researchers in core supervised learning as well as in application domains such as recommender systems, computer vision, computational advertising, information retrieval and natural language processing. We expect healthy participation from both industry and academia.

http://www.manikvarma.org/events/XC16/schedule.html

Opening Remarks by Manik, Moustapha & Samy (Talk)
Label Ranking with Biased Partial Feedback (Keynote)
Distributed Optimization of Multi-Class SVMs (Spotlight)
DiSMEC - Distributed Sparse Machines for Extreme Multi-label Classification (Spotlight)
A Primal and Dual Sparse Approach to Extreme Classification (Keynote)
Extreme Multi-label Loss Functions for Tagging, Ranking & Recommendation (Keynote)
Log-time and Log-space Extreme Classification (Spotlight)
Extreme Classification with Label Features (Spotlight)
Dual Decomposed Learning with Factorwise Oracles for Structural SVMs of Large Output Domain (Spotlight)
Lunch (Break)
Semi-supervised dimension reduction for large numbers of classes (Keynote)
A Theoretical Framework for Structured Prediction using Factor Graph Complexity (Spotlight)
Deep Schatten Networks (Keynote)
Regret Bounds for Non-decomposable Metrics with Missing Labels (Spotlight)
Training neural networks in time independent of output layer size (Keynote)
Coffee Break (Break)
Efficient softmax approximation for GPUs (Spotlight)
Pointer Sentinel Mixture Models (Spotlight)
A Simple but Tough-to-Beat Baseline for Sentence Embeddings (Spotlight)
Break
iCaRL: incremental classifier and representation learning (Keynote)
Is a picture worth a thousand words? a Deep Multi Modal Product Classification Architecture for e-commerce (Spotlight)
Learning to Solve Vision without Annotating Millions of Images (Keynote)