Timezone: »
Video-language modeling has attracted much attention with the rapid growth of web videos. Most existing methods assume that the video frames and text description are semantically correlated, and focus on video-language modeling at video level. However, this hypothesis often fails for two reasons: (1) With the rich semantics of video contents, it is difficult to cover all frames with a single video-level description; (2) A raw video typically has noisy/meaningless information (e.g., scenery shot, transition or teaser). Although a number of recent works deploy attention mechanism to alleviate this problem, the irrelevant/noisy information still makes it very difficult to address. To overcome such challenge, we thus propose an efficient and effective model, termed Language-Guided Denoising Network (LGDN), for video-language modeling. Different from most existing methods that utilize all extracted video frames, LGDN dynamically filters out the misaligned or redundant frames under the language supervision and obtains only 2--4 salient frames per video for cross-modal token-level alignment. Extensive experiments on five public datasets show that our LGDN outperforms the state-of-the-arts by large margins. We also provide detailed ablation study to reveal the critical importance of solving the noise issue, in hope of inspiring future video-language work.
Author Information
Haoyu Lu (Renmin University of China)
Mingyu Ding (The University of Hong Kong)
Nanyi Fei (Renmin University of China)
Yuqi Huo (Renmin University of China)
Zhiwu Lu (Renmin University of China)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: LGDN: Language-Guided Denoising Network for Video-Language Modeling »
Dates n/a. Room
More from the Same Authors
-
2022 Poster: Fine-Grained Analysis of Stability and Generalization for Modern Meta Learning Algorithms »
Jiechao Guan · Yong Liu · Zhiwu Lu -
2022 Poster: BMU-MoCo: Bidirectional Momentum Update for Continual Video-Language Modeling »
Yizhao Gao · Nanyi Fei · Haoyu Lu · Zhiwu Lu · Hao Jiang · Yijie Li · Zhao Cao -
2022 : Planning with Large Language Models for Code Generation »
Shun Zhang · Zhenfang Chen · Yikang Shen · Mingyu Ding · Josh Tenenbaum · Chuang Gan -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: BMU-MoCo: Bidirectional Momentum Update for Continual Video-Language Modeling »
Yizhao Gao · Nanyi Fei · Haoyu Lu · Zhiwu Lu · Hao Jiang · Yijie Li · Zhao Cao -
2022 Spotlight: Lightning Talks 6A-1 »
Ziyi Wang · Nian Liu · Yaming Yang · Qilong Wang · Yuanxin Liu · Zongxin Yang · Yizhao Gao · Yanchen Deng · Dongze Lian · Nanyi Fei · Ziyu Guan · Xiao Wang · Shufeng Kong · Xumin Yu · Daquan Zhou · Yi Yang · Fandong Meng · Mingze Gao · Caihua Liu · Yongming Rao · Zheng Lin · Haoyu Lu · Zhe Wang · Jiashi Feng · Zhaolin Zhang · Deyu Bo · Xinchao Wang · Chuan Shi · Jiangnan Li · Jiangtao Xie · Jie Zhou · Zhiwu Lu · Wei Zhao · Bo An · Jiwen Lu · Peihua Li · Jian Pei · Hao Jiang · Cai Xu · Peng Fu · Qinghua Hu · Yijie Li · Weigang Lu · Yanan Cao · Jianbin Huang · Weiping Wang · Zhao Cao · Jie Zhou -
2022 Spotlight: Fine-Grained Analysis of Stability and Generalization for Modern Meta Learning Algorithms »
Jiechao Guan · Yong Liu · Zhiwu Lu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2021 Poster: Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language »
Mingyu Ding · Zhenfang Chen · Tao Du · Ping Luo · Josh Tenenbaum · Chuang Gan -
2021 Poster: Compressed Video Contrastive Learning »
Yuqi Huo · Mingyu Ding · Haoyu Lu · Nanyi Fei · Zhiwu Lu · Ji-Rong Wen · Ping Luo -
2018 Poster: Domain-Invariant Projection Learning for Zero-Shot Recognition »
An Zhao · Mingyu Ding · Jiechao Guan · Zhiwu Lu · Tao Xiang · Ji-Rong Wen