Toronto Deep Learning
Jamie Kiros · Russ Salakhutdinov · Nitish Srivastava · Yichuan Charlie Tang

Tue Dec 09 04:00 PM -- 08:59 PM (PST) @ Level 2, room 230B
Event URL: http://deeplearning.cs.toronto.edu/

We demonstrate an interactive system for tagging, retrieving and generating sentence descriptions for images. Our models are based on learning a multimodal vector space using deep convolutional networks and long short-term memory (LSTM) recurrent networks for encoding images and sentences. A highly structured multimodal neural language model is used for decoding and generating image descriptions from scratch.
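
As a rough illustration of what retrieval in a shared image-sentence vector space can look like, the sketch below ranks candidate sentences by cosine similarity to an image vector. It is a minimal sketch under stated assumptions, not the demonstrated system: the choice of cosine similarity, the function names `cosine` and `rank_sentences`, and the three-dimensional toy vectors are all hypothetical. In the system described above, the vectors would come from the convolutional network and LSTM encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_sentences(image_vec, sentence_vecs):
    """Return sentences sorted from most to least similar to the image vector."""
    return sorted(
        sentence_vecs,
        key=lambda s: cosine(image_vec, sentence_vecs[s]),
        reverse=True,
    )


# Hand-made toy embeddings (assumption: stand-ins for real encoder outputs).
image_vec = [0.9, 0.1, 0.0]
sentence_vecs = {
    "a dog runs on grass": [0.8, 0.2, 0.1],
    "a plate of food": [0.0, 0.1, 0.9],
    "a man rides a bike": [0.3, 0.9, 0.1],
}

ranking = rank_sentences(image_vec, sentence_vecs)
for sentence in ranking:
    print(sentence)
```

Because both modalities live in the same space, the same ranking function could be used in the other direction, scoring a set of image vectors against one sentence vector.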

Alongside this, we will also showcase a mobile app with which a user can photograph things with their phone (such as objects in the demonstration room) and have the images classified in real time.

Author Information

Jamie Kiros (Google Brain)
Russ Salakhutdinov (Carnegie Mellon University)
Nitish Srivastava (Apple Inc.)
Yichuan Charlie Tang (Apple Inc.)