Timezone: »
To build an artificial neural network like the biological intelligence system, recent works have unified numerous tasks into a generalist model, which can process various tasks with shared parameters and do not have any task-specific modules. While generalist models achieve promising results on various benchmarks, they have performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main factor to this phenomenon. To mitigate such interference, we introduce the Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs still holds the generalization ability of generalist models to conduct zero-shot inference on new tasks, e.g., videotext retrieval and video caption. Code and pre-trained generalist models are publicly released at https://github.com/fundamentalvision/Uni-Perceiver.
Author Information
Jinguo Zhu (Xi'an Jiaotong University)
Xizhou Zhu (Shanghai AI Laboratory)
Wenhai Wang (Nanjing University)
Xiaohua Wang (Xi'an Jiaotong University)
Hongsheng Li (The Chinese University of Hong Kong)
Xiaogang Wang (The Chinese University of Hong Kong)
Jifeng Dai (Tsinghua University)
More from the Same Authors
-
2022 Spotlight: Lightning Talks 4B-3 »
Zicheng Zhang · Mancheng Meng · Antoine Guedon · Yue Wu · Wei Mao · Zaiyu Huang · Peihao Chen · Shizhe Chen · yongwei chen · Keqiang Sun · Yi Zhu · chen rui · Hanhui Li · Dongyu Ji · Ziyan Wu · miaomiao Liu · Pascal Monasse · Yu Deng · Shangzhe Wu · Pierre-Louis Guhur · Jiaolong Yang · Kunyang Lin · Makarand Tapaswi · Zhaoyang Huang · Terrence Chen · Jiabao Lei · Jianzhuang Liu · Vincent Lepetit · Zhenyu Xie · Richard I Hartley · Dinggang Shen · Xiaodan Liang · Runhao Zeng · Cordelia Schmid · Michael Kampffmeyer · Mathieu Salzmann · Ning Zhang · Fangyun Wei · Yabin Zhang · Fan Yang · Qifeng Chen · Wei Ke · Quan Wang · Thomas Li · qingling Cai · Kui Jia · Ivan Laptev · Mingkui Tan · Xin Tong · Hongsheng Li · Xiaodan Liang · Chuang Gan -
2022 Spotlight: ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning »
Junting Pan · Ziyi Lin · Xiatian Zhu · Jing Shao · Hongsheng Li -
2022 Spotlight: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields »
Keqiang Sun · Shangzhe Wu · Zhaoyang Huang · Ning Zhang · Quan Wang · Hongsheng Li -
2022 Spotlight: Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs »
Jinguo Zhu · Xizhou Zhu · Wenhai Wang · Xiaohua Wang · Hongsheng Li · Xiaogang Wang · Jifeng Dai -
2022 Spotlight: MCMAE: Masked Convolution Meets Masked Autoencoders »
Peng Gao · Teli Ma · Hongsheng Li · Ziyi Lin · Jifeng Dai · Yu Qiao -
2022 Spotlight: Lightning Talks 2B-1 »
Yehui Tang · Jian Wang · Zheng Chen · man zhou · Peng Gao · Chenyang Si · SHANGKUN SUN · Yixing Xu · Weihao Yu · Xinghao Chen · Kai Han · Hu Yu · Yulun Zhang · Chenhui Gou · Teli Ma · Yuanqi Chen · Yunhe Wang · Hongsheng Li · Jinjin Gu · Jianyuan Guo · Qiman Wu · Pan Zhou · Yu Zhu · Jie Huang · Chang Xu · Yichen Zhou · Haocheng Feng · Guodong Guo · yongbing zhang · Ziyi Lin · Feng Zhao · Ge Li · Junyu Han · Jinwei Gu · Jifeng Dai · Chao Xu · Xinchao Wang · Linghe Kong · Shuicheng Yan · Yu Qiao · Chen Change Loy · Xin Yuan · Errui Ding · Yunhe Wang · Deyu Meng · Jingdong Wang · Chongyi Li -
2022 Poster: Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training »
Renrui Zhang · Ziyu Guo · Peng Gao · Rongyao Fang · Bin Zhao · Dong Wang · Yu Qiao · Hongsheng Li -
2022 Poster: MCMAE: Masked Convolution Meets Masked Autoencoders »
Peng Gao · Teli Ma · Hongsheng Li · Ziyi Lin · Jifeng Dai · Yu Qiao -
2022 Poster: ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning »
Junting Pan · Ziyi Lin · Xiatian Zhu · Jing Shao · Hongsheng Li -
2022 Poster: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields »
Keqiang Sun · Shangzhe Wu · Zhaoyang Huang · Ning Zhang · Quan Wang · Hongsheng Li -
2021 Poster: DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks »
Wei Sun · Aojun Zhou · Sander Stuijk · Rob Wijnhoven · Andrew Nelson · Hongsheng Li · Henk Corporaal -
2021 Poster: Container: Context Aggregation Networks »
peng gao · Jiasen Lu · Hongsheng Li · Roozbeh Mottaghi · Aniruddha Kembhavi -
2021 Poster: ReSSL: Relational Self-Supervised Learning with Weak Augmentation »
Mingkai Zheng · Shan You · Fei Wang · Chen Qian · Changshui Zhang · Xiaogang Wang · Chang Xu -
2020 Poster: Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection »
Xiang Li · Wenhai Wang · Lijun Wu · Shuo Chen · Xiaolin Hu · Jun Li · Jinhui Tang · Jian Yang -
2020 Poster: Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID »
Yixiao Ge · Feng Zhu · Dapeng Chen · Rui Zhao · Hongsheng Li -
2020 Poster: Balanced Meta-Softmax for Long-Tailed Visual Recognition »
Jiawei Ren · Cunjun Yu · shunan sheng · Xiao Ma · Haiyu Zhao · Shuai Yi · Hongsheng Li -
2019 Poster: Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis »
Xihui Liu · Guojun Yin · Jing Shao · Xiaogang Wang · Hongsheng Li -
2019 Poster: PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph »
Yikang LI · Tao Ma · Yeqi Bai · Nan Duan · Sining Wei · Xiaogang Wang -
2018 Poster: FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification »
Yixiao Ge · Zhuowan Li · Haiyu Zhao · Guojun Yin · Shuai Yi · Xiaogang Wang · Hongsheng Li -
2017 Poster: Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction »
Dan Xu · Wanli Ouyang · Xavier Alameda-Pineda · Elisa Ricci · Xiaogang Wang · Nicu Sebe -
2016 Poster: CRF-CNN: Modeling Structured Information in Human Pose Estimation »
Xiao Chu · Wanli Ouyang · Hongsheng Li · Xiaogang Wang -
2014 Poster: Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations »
Zhenyao Zhu · Ping Luo · Xiaogang Wang · Xiaoou Tang -
2014 Poster: Deep Learning Face Representation by Joint Identification-Verification »
Yi Sun · Yuheng Chen · Xiaogang Wang · Xiaoou Tang