Workshop Overview
Multimodal machine learning aims at building models that can process and relate information from multiple modalities. From the early research on audio-visual speech recognition to the recent explosion of interest in models mapping images to natural language, multimodal machine learning is a vibrant multidisciplinary field of increasing importance and with extraordinary potential.
Learning from paired multimodal sources offers the possibility of capturing correspondences between modalities and gaining an in-depth understanding of natural phenomena. Thus, multimodal data provides a means of reducing our dependence on the more standard supervised learning paradigm, which is inherently limited by the availability of labeled examples.
This research field brings some unique challenges for machine learning researchers, given the heterogeneity of the data and the complementarity often found between modalities. This workshop will facilitate progress in multimodal machine learning by bringing together researchers from natural language processing, multimedia, computer vision, speech processing and machine learning to discuss the current challenges and identify the research infrastructure needed to enable stronger multidisciplinary collaboration.
For keynote talk abstracts and MMML 2015 workshop proceedings:
https://sites.google.com/site/multiml2015/
Oral presentation
- Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences. Hongyuan Mei, Mohit Bansal, Matthew Walter
Oral spotlights
- An Analysis-By-Synthesis Approach to Multisensory Object Shape Perception. Goker Erdogan, Ilker Yildirim, Robert Jacobs
- Active Perception based on Multimodal Hierarchical Dirichlet Processes. Tadahiro Taniguchi, Toshiaki Takano, Ryo Yoshino
- Towards Deep Alignment of Multimodal Data. George Trigeorgis, Mihalis Nicolaou, Stefanos Zafeiriou, Bjorn Schuller
- Multimodal Transfer Deep Learning with an Application in Audio-Visual Recognition. Seungwhan Moon, Suyoun Kim, Haohan Wang
Posters
- Multimodal Convolutional Neural Networks for Matching Image and Sentence. Lin Ma, Zhengdong Lu, Lifeng Shang, Hang Li
- Group sparse factorization of multiple data views. Eemeli Leppäaho, Samuel Kaski
- Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation. Angeliki Lazaridou, Dat Tien Nguyen, Raffaella Bernardi, Marco Baroni
- Cross-Modal Attribute Recognition in Fashion. Susana Zoghbi, Geert Heyman, Juan Carlos Gomez Carranza, Marie-Francine Moens
- Multimodal Sparse Coding for Event Detection. Youngjune Gwon, William Campbell, Kevin Brady, Douglas Sturim, Miriam Cha, H. T. Kung
- Multimodal Symbolic Association using Parallel Multilayer Perceptron. Federico Raue, Sebastian Palacio, Thomas Breuel, Wonmin Byeon, Andreas Dengel, Marcus Liwicki
- Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning. Janarthanan Rajendran, Mitesh Khapra, Sarath Chandar, Balaraman Ravindran
- Multimodal Learning of Object Concepts and Word Meanings by Robots. Tatsuya Aoki, Takayuki Nagai, Joe Nishihara, Tomoaki Nakamura, Muhammad Attamimi
- Multi-task, Multi-Kernel Learning for Estimating Individual Wellbeing. Natasha Jaques, Sara Taylor, Akane Sano, Rosalind Picard
- Generating Images from Captions with Attention. Elman Mansimov, Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov
- Manifold Alignment Determination. Andreas Damianou, Neil Lawrence, Carl Henrik Ek
- Accelerating Multimodal Sequence Retrieval with Convolutional Networks. Colin Raffel, Daniel P. W. Ellis
- Audio-Visual Fusion for Noise Robust Speech Recognition. Nagasrikanth Kallakuri, Ian Lane
- Learning Multimodal Semantic Models for Image Question Answering. Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng
- Greedy Vector-valued Multi-view Learning. Hachem Kadri, Stephane Ayache, Cecile Capponi, François-Xavier Dupé
- S2VT: Sequence to Sequence -- Video to Text. Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko
Fri 6:00 a.m. - 6:15 a.m. | Introduction (Talk) | Aaron Courville
Fri 6:15 a.m. - 7:00 a.m. | Visual Question Answering (Talk) | Dhruv Batra
Fri 7:00 a.m. - 7:30 a.m. | Listen, Attend and Walk: Neural Mapping of Navigational Instructions to Action Sequences (Talk) | Matthew Walter
Fri 7:30 a.m. - 8:00 a.m. | Accepted Orals and Spotlights (Spotlight) | Seungwhan Moon · George Trigeorgis · Goker Erdogan · Tadahiro Taniguchi
Fri 8:00 a.m. - 8:30 a.m. | Multimodal Transfer Deep Learning with Applications in Audio-Visual Recognition (Talk) | Seungwhan Moon
Fri 11:30 a.m. - 12:15 p.m. | Generating Natural-Language Video Descriptions using LSTM Recurrent Neural Networks (Talk) | Raymond Mooney
Fri 12:15 p.m. - 1:00 p.m. | Cross-Modality Distant Supervised Learning for Speech, Text, and Image Classification (Talk) | Li Deng
Fri 1:30 p.m. - 2:15 p.m. | Generating Images from Captions with Attention (Talk) | Russ Salakhutdinov
Fri 2:15 p.m. - 3:00 p.m. | Automatic Cross-Media Event Schema Construction and Knowledge Population (Talk) | Heng Ji
Author Information
Louis-Philippe Morency (Carnegie Mellon University)
Tadas Baltrusaitis (Carnegie Mellon University)
Aaron Courville (University of Montreal)
Kyunghyun Cho (NYU)
Kyunghyun Cho is an associate professor of computer science and data science at New York University and a research scientist at Facebook AI Research. He was a postdoctoral fellow at the Université de Montréal until summer 2015 under the supervision of Prof. Yoshua Bengio, and received PhD and MSc degrees from Aalto University in early 2014 under the supervision of Prof. Juha Karhunen, Dr. Tapani Raiko and Dr. Alexander Ilin. He tries his best to find a balance among machine learning, natural language processing, and life, but almost always fails to do so.