The natural world abounds with underlying concepts that are expressed across heterogeneous sources such as the visual, acoustic, tactile, and linguistic modalities. Despite vast differences among these raw modalities, humans seamlessly perceive multimodal data, learn new concepts, and show extraordinary capabilities in generalizing across input modalities. Most existing progress in multimodal learning, however, focuses on problems where the same set of modalities is present at both train and test time, which makes learning in low-resource modalities particularly difficult. In this work, we propose a general algorithm for cross-modal generalization: a learning paradigm in which data from more abundant source modalities is used to learn useful representations for scarce target modalities. Our algorithm is based on meta-alignment, a novel method that aligns representation spaces across modalities while ensuring quick generalization to new concepts. Experimental results on generalizing from image to audio classification and from text to speech classification demonstrate strong performance on classifying data from an entirely new target modality with only a few (1-10) labeled samples. In addition, our method works particularly well when the target modality suffers from noisy or limited labels, a scenario prevalent in low-resource modalities.
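To make the idea concrete, below is a minimal sketch of what a meta-alignment training step could look like. It assumes, as one plausible instantiation, an InfoNCE-style contrastive loss that aligns paired source/target embeddings in a shared space, combined with a prototypical-network episode loss for fast few-shot adaptation in the target modality. All encoder architectures, feature dimensions, and hyperparameters here are illustrative placeholders, not the authors' released implementation.

```python
# Hypothetical sketch of a meta-alignment training step (PyTorch).
# Contrastive alignment + prototypical few-shot episode; all names,
# dims, and losses are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps one modality into a shared, L2-normalized embedding space."""
    def __init__(self, in_dim, d=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, d))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def alignment_loss(z_src, z_tgt, tau=0.1):
    """InfoNCE-style loss: paired source/target embeddings attract,
    mismatched pairs within the batch repel."""
    logits = z_src @ z_tgt.t() / tau                 # (B, B) similarity matrix
    labels = torch.arange(z_src.size(0))             # diagonal entries are positives
    return F.cross_entropy(logits, labels)

def episode_loss(z_sup, y_sup, z_qry, y_qry, n_way):
    """Prototypical few-shot loss: classify queries by the nearest
    class prototype computed from the support set."""
    protos = torch.stack([z_sup[y_sup == c].mean(0) for c in range(n_way)])
    return F.cross_entropy(-torch.cdist(z_qry, protos), y_qry)

# Two modality-specific encoders sharing one embedding space
# (feature dims are placeholders, e.g. image features -> audio features).
src_enc, tgt_enc = Encoder(in_dim=512), Encoder(in_dim=64)
opt = torch.optim.Adam([*src_enc.parameters(), *tgt_enc.parameters()], lr=1e-3)

# Synthetic stand-ins: a paired cross-modal batch plus a 5-way 2-shot
# episode in the scarce target modality.
x_src, x_tgt = torch.randn(32, 512), torch.randn(32, 64)
xs, ys = torch.randn(10, 64), torch.arange(5).repeat(2)   # support: 5 classes x 2 shots
xq, yq = torch.randn(15, 64), torch.arange(5).repeat(3)   # query:   5 classes x 3 each

# One meta-training step: align the modalities AND solve a few-shot episode,
# so the shared space both transfers across modalities and adapts quickly.
loss = alignment_loss(src_enc(x_src), tgt_enc(x_tgt)) \
     + episode_loss(tgt_enc(xs), ys, tgt_enc(xq), yq, n_way=5)
opt.zero_grad()
loss.backward()
opt.step()
```

The intuition behind this split objective: the alignment term lets abundant source-modality data shape the shared representation space, while the episode term rehearses exactly the few-shot classification setting faced at test time in the scarce target modality.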
Author Information
Paul Pu Liang (Carnegie Mellon University)
More from the Same Authors
- 2021 : MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
  Paul Pu Liang · Yiwei Lyu · Xiang Fan · Zetian Wu · Yun Cheng · Jason Wu · Leslie (Yufan) Chen · Peter Wu · Michelle A. Lee · Yuke Zhu · Ruslan Salakhutdinov · Louis-Philippe Morency
- 2022 : Learning More Effective Cell Representations Efficiently
  Jason Xiaotian Dou · Minxue Jia · Nika Zaslavsky · Haiyi Mao · Runxue Bao · Ni Ke · Paul Pu Liang · Zhi-Hong Mao
- 2022 : MultiViz: Towards Visualizing and Understanding Multimodal Models
  Paul Pu Liang · · Gunjan Chhablani · Nihal Jain · Zihao Deng · Xingbo Wang · Louis-Philippe Morency · Ruslan Salakhutdinov
- 2022 : Nano: Nested Human-in-the-Loop Reward Learning for Controlling Distribution of Generated Text
  Xiang Fan · · Paul Pu Liang · Ruslan Salakhutdinov · Louis-Philippe Morency
- 2020 Workshop: First Workshop on Quantum Tensor Networks in Machine Learning
  Xiao-Yang Liu · Qibin Zhao · Jacob Biamonte · Cesar F Caiafa · Paul Pu Liang · Nadav Cohen · Stefan Leichenauer
- 2019 : Extended Poster Session
  Travis LaCroix · Marie Ossenkopf · Mina Lee · Nicole Fitzgerald · Daniela Mihai · Jonathon Hare · Ali Zaidi · Alexander Cowen-Rivers · Alana Marzoev · Eugene Kharitonov · Luyao Yuan · Tomasz Korbak · Paul Pu Liang · Yi Ren · Roberto Dessì · Peter Potash · Shangmin Guo · Tatsunori Hashimoto · Percy Liang · Julian Zubek · Zipeng Fu · Song-Chun Zhu · Adam Lerer
- 2019 Poster: Deep Gamblers: Learning to Abstain with Portfolio Theory
  Liu Ziyin · Zhikang Wang · Paul Pu Liang · Russ Salakhutdinov · Louis-Philippe Morency · Masahito Ueda
- 2018 : Coffee break + posters 2
  Jan Kremer · Erik McDermott · Brandon Carter · Albert Zeyer · Andreas Krug · Paul Pu Liang · Katherine Lee · Dominika Basaj · Abelino Jimenez · Lisa Fan · Gautam Bhattacharya · Tzeviya S Fuchs · David Gifford · Loren Lugosch · Orhan Firat · Benjamin Baer · Jahangir Alam · Jamin Shin · Mirco Ravanelli · Paul Smolensky · Zining Zhu · Hamid Eghbal-zadeh · Skyler Seto · Imran Sheikh · Joao Felipe Santos · Yonatan Belinkov · Nadir Durrani · Oiwi Parker Jones · Shuai Tang · André Merboldt · Titouan Parcollet · Wei-Ning Hsu · Krishna Pillutla · Ehsan Hosseini-Asl · Monica Dinculescu · Alexander Amini · Ying Zhang · Taoli Cheng · Alain Tapp
- 2018 : Modeling Spatiotemporal Multimodal Language with Recurrent Multistage Fusion
  Paul Pu Liang