Pre-training vision-language models with contrastive objectives has shown promising results that are both scalable to large uncurated datasets and transferable to many downstream applications. Subsequent works have aimed to improve data efficiency by adding self-supervision terms, but in those works the inter-domain (image-text) contrastive loss and the intra-domain (image-image) contrastive loss are defined on separate spaces, so many feasible combinations of supervision are overlooked. To overcome this issue, we propose UniCLIP, a Unified framework for Contrastive Language-Image Pre-training. UniCLIP integrates the contrastive losses of both inter-domain pairs and intra-domain pairs into a single universal space. The discrepancies that arise when integrating contrastive losses across different domains are resolved by the three key components of UniCLIP: (1) augmentation-aware feature embedding, (2) MP-NCE loss, and (3) domain-dependent similarity measure. UniCLIP outperforms previous vision-language pre-training methods on various single- and multi-modality downstream tasks. In our experiments, we show that each component of UniCLIP contributes to the final performance.
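To make the idea of a single universal space concrete, the following is a minimal sketch of a generic multi-positive InfoNCE-style loss computed over image and text embeddings together. It is not the MP-NCE loss of the paper, whose exact form is not given in this abstract. The function names, the `instance_ids` and `domains` arguments, and the per-domain-pair temperature table are all hypothetical choices made for illustration; the temperature table is only one plausible reading of a "domain-dependent similarity measure".

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def unified_contrastive_loss(embeddings, instance_ids, domains, temperatures):
    """Generic multi-positive InfoNCE over one shared embedding space.

    ASSUMPTION: this is an illustrative stand-in, not the paper's MP-NCE loss.
    embeddings   -- list of vectors (image views and captions mixed together)
    instance_ids -- embeddings with the same id are treated as positives
    domains      -- "image" or "text" for each embedding
    temperatures -- hypothetical table: sorted domain pair -> temperature
    """
    total, count = 0.0, 0
    for i in range(len(embeddings)):
        pos_sum, all_sum = 0.0, 0.0
        for j in range(len(embeddings)):
            if i == j:
                continue  # an embedding is never contrasted with itself
            # Pick the temperature by the (unordered) pair of domains.
            tau = temperatures[tuple(sorted((domains[i], domains[j])))]
            weight = math.exp(cosine(embeddings[i], embeddings[j]) / tau)
            all_sum += weight
            if instance_ids[i] == instance_ids[j]:
                pos_sum += weight
        if pos_sum > 0.0:  # skip anchors that have no positive
            total += -math.log(pos_sum / all_sum)
            count += 1
    if count == 0:
        raise ValueError("no anchor has a positive pair")
    return total / count


# Two instances, each with two image views and one caption embedding.
embeddings = [
    [1.0, 0.0], [0.9, 0.1], [1.0, 0.05],   # instance 0: image, image, text
    [0.0, 1.0], [0.1, 0.9], [0.05, 1.0],   # instance 1: image, image, text
]
domains = ["image", "image", "text", "image", "image", "text"]
temperatures = {
    ("image", "image"): 0.1,
    ("image", "text"): 0.1,
    ("text", "text"): 0.1,
}

aligned = unified_contrastive_loss(embeddings, [0, 0, 0, 1, 1, 1], domains, temperatures)
shuffled = unified_contrastive_loss(embeddings, [0, 1, 0, 1, 0, 1], domains, temperatures)
print(aligned, shuffled)
```

In this toy setup, image-image and image-text pairs all contribute to the same sum, so every positive pairing of views and captions supplies supervision at once. The loss is lower when the ids match the geometric clusters than when they are shuffled.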
Author Information
Janghyeon Lee (LG AI Research)
Jongsuk Kim (KAIST)
Hyounguk Shon (Korea Advanced Institute of Science and Technology)
Bumsoo Kim (LG AI Research)
Seung Hwan Kim (LG AI Research)
Honglak Lee (LG AI Research / U. Michigan)
Junmo Kim (KAIST)
More from the Same Authors
- 2021: Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks
  Yijie Guo · Qiucheng Wu · Honglak Lee
- 2021: Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization
  Sungryull Sohn · Hyunjae Woo · Jongwook Choi · Izzeddin Gur · Aleksandra Faust · Honglak Lee
- 2021: Learning Parameterized Task Structure for Generalization to Unseen Entities
  Anthony Liu · Sungryull Sohn · Honglak Lee
- 2021: SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning
  Jongjin Park · Younggyo Seo · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee
- 2021: Learning compositional tasks from language instructions
  Lajanugen Logeswaran · Wilka Carvalho · Honglak Lee
- 2022: Allele-conditional attention mechanism for HLA-peptide complex binding affinity prediction
  Rodrigo Hormazabal · Doyeong Hwang · Kiyoung Kim · Sehui Han · Kyunghoon Bae · Honglak Lee
- 2022: Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization
  Changyeon Kim · Junsu Kim · Younggyo Seo · Kimin Lee · Honglak Lee · Jinwoo Shin
- 2022: Learning Exploration Policies with View-based Intrinsic Rewards
  Yijie Guo · Yao Fu · Run Peng · Honglak Lee
- 2022: Training Time Adversarial Attack Aiming the Vulnerability of Continual Learning
  Gyojin Han · Jaehyun Choi · HyeongGwon Hong · Junmo Kim
- 2022: ReSPack: A Large-Scale Rectilinear Steiner Tree Packing Data Generator and Benchmark
  Kanghoon Lee · Youngjoon Park · Han-Seul Jeong · Deunsol Yoon · Sunghoon Hong · Sungryull Sohn · Minu Kim · Hanbum Ko · Moontae Lee · Honglak Lee · Kyunghoon Kim · Euihyuk Kim · Seonggeon Cho · Jaesang Min · Woohyung Lim
- 2022 Poster: Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
  Byoungjip Kim · Sungik Choi · Dasol Hwang · Moontae Lee · Honglak Lee
- 2022 Poster: Pure Transformers are Powerful Graph Learners
  Jinwoo Kim · Dat Nguyen · Seonwoo Min · Sungjun Cho · Moontae Lee · Honglak Lee · Seunghoon Hong
- 2022 Poster: OpenSRH: optimizing brain tumor surgery using intraoperative stimulated Raman histology
  Cheng Jiang · Asadur Chowdury · Xinhai Hou · Akhil Kondepudi · Christian Freudiger · Kyle Conway · Sandra Camelo-Piragua · Daniel Orringer · Honglak Lee · Todd Hollon
- 2022 Poster: Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost
  Sungjun Cho · Seonwoo Min · Jinwoo Kim · Moontae Lee · Honglak Lee · Seunghoon Hong
- 2022 Poster: CEDe: A collection of expert-curated datasets with atom-level entity annotations for Optical Chemical Structure Recognition
  Rodrigo Hormazabal · Changyoung Park · Soonyoung Lee · Sehui Han · Yeonsik Jo · Jaewan Lee · Ahra Jo · Seung Hwan Kim · Jaegul Choo · Moontae Lee · Honglak Lee
- 2022 Expo Talk Panel: Towards learning agents for solving complex real-world tasks
  Honglak Lee
- 2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features?
  Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi
- 2021 Poster: Improving Transferability of Representations via Augmentation-Aware Self-Supervision
  Hankook Lee · Kibok Lee · Kimin Lee · Honglak Lee · Jinwoo Shin
- 2021 Poster: Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
  Christopher Hoang · Sungryull Sohn · Jongwook Choi · Wilka Carvalho · Honglak Lee
- 2021 Poster: Environment Generation for Zero-Shot Compositional Reinforcement Learning
  Izzeddin Gur · Natasha Jaques · Yingjie Miao · Jongwook Choi · Manoj Tiwari · Honglak Lee · Aleksandra Faust
- 2020 Poster: Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards
  Yijie Guo · Jongwook Choi · Marcin Moczulski · Shengyu Feng · Samy Bengio · Mohammad Norouzi · Honglak Lee
- 2020 Poster: Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning
  Guangxiang Zhu · Minghao Zhang · Honglak Lee · Chongjie Zhang
- 2019 Poster: High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks
  Ruben Villegas · Arkanath Pathak · Harini Kannan · Dumitru Erhan · Quoc V Le · Honglak Lee
- 2018 Poster: Constructing Fast Network through Deconstruction of Convolution
  Yunho Jeon · Junmo Kim
- 2018 Poster: A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
  Kimin Lee · Kibok Lee · Honglak Lee · Jinwoo Shin
- 2018 Spotlight: A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
  Kimin Lee · Kibok Lee · Honglak Lee · Jinwoo Shin
- 2018 Spotlight: Constructing Fast Network through Deconstruction of Convolution
  Yunho Jeon · Junmo Kim
- 2018 Poster: Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies
  Sungryull Sohn · Junhyuk Oh · Honglak Lee
- 2018 Poster: Learning Hierarchical Semantic Image Manipulation through Structured Representations
  Seunghoon Hong · Xinchen Yan · Thomas Huang · Honglak Lee
- 2017: Invited Talk 5
  Honglak Lee
- 2017 Workshop: Learning Disentangled Features: from Perception to Control
  Emily Denton · Siddharth Narayanaswamy · Tejas Kulkarni · Honglak Lee · Diane Bouchacourt · Josh Tenenbaum · David Pfau
- 2017 Poster: Value Prediction Network
  Junhyuk Oh · Satinder Singh · Honglak Lee
- 2016 Poster: Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
  Xinchen Yan · Jimei Yang · Ersin Yumer · Yijie Guo · Honglak Lee
- 2016 Poster: Learning What and Where to Draw
  Scott E Reed · Zeynep Akata · Santosh Mohan · Samuel Tenka · Bernt Schiele · Honglak Lee
- 2016 Oral: Learning What and Where to Draw
  Scott E Reed · Zeynep Akata · Santosh Mohan · Samuel Tenka · Bernt Schiele · Honglak Lee
- 2015: Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning
  Honglak Lee
- 2015 Symposium: Deep Learning Symposium
  Yoshua Bengio · Marc'Aurelio Ranzato · Honglak Lee · Max Welling · Andrew Y Ng
- 2015 Poster: Deep Visual Analogy-Making
  Scott E Reed · Yi Zhang · Yuting Zhang · Honglak Lee
- 2015 Poster: Action-Conditional Video Prediction using Deep Networks in Atari Games
  Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh
- 2015 Spotlight: Action-Conditional Video Prediction using Deep Networks in Atari Games
  Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh
- 2015 Oral: Deep Visual Analogy-Making
  Scott E Reed · Yi Zhang · Yuting Zhang · Honglak Lee
- 2015 Poster: Learning Structured Output Representation using Deep Conditional Generative Models
  Kihyuk Sohn · Honglak Lee · Xinchen Yan
- 2015 Poster: Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis
  Jimei Yang · Scott E Reed · Ming-Hsuan Yang · Honglak Lee
- 2014 Workshop: Representation and Learning Methods for Complex Outputs
  Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto
- 2014 Poster: Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning
  Xiaoxiao Guo · Satinder Singh · Honglak Lee · Richard L Lewis · Xiaoshi Wang
- 2014 Poster: Improved Multimodal Deep Learning with Variation of Information
  Kihyuk Sohn · Wenling Shang · Honglak Lee
- 2013 Poster: Robust Image Denoising with Multi-Column Deep Neural Networks
  Forest Agostinelli · Michael R Anderson · Honglak Lee
- 2012 Poster: Learning to Align from Scratch
  Gary B Huang · Marwan A Mattar · Honglak Lee · Erik Learned-Miller
- 2010 Workshop: Deep Learning and Unsupervised Feature Learning
  Honglak Lee · Marc'Aurelio Ranzato · Yoshua Bengio · Geoffrey E Hinton · Yann LeCun · Andrew Y Ng
- 2009 Poster: Unsupervised feature learning for audio classification using convolutional deep belief networks
  Honglak Lee · Peter Pham · Yan Largman · Andrew Y Ng
- 2007 Poster: Sparse deep belief net model for visual area V2
  Honglak Lee · Ekanadham Chaitanya · Andrew Y Ng
- 2006 Poster: Efficient sparse coding algorithms, end-stopping and nCRF surround suppression
  Honglak Lee · Alexis Battle · Raina Rajat · Andrew Y Ng