Text-to-image generation in the general domain has long been an open problem that requires both a powerful generative model and cross-modal understanding. We propose CogView, a 4-billion-parameter Transformer with a VQ-VAE tokenizer, to address this problem. We also demonstrate finetuning strategies for various downstream tasks, e.g. style learning, super-resolution, text-image ranking and fashion design, and methods to stabilize pretraining, e.g. eliminating NaN losses. CogView achieves the state-of-the-art FID on the blurred MS COCO dataset, outperforming previous GAN-based models and a recent similar work, DALL-E.
Author Information
Ming Ding (Tsinghua University)
Zhuoyi Yang (Tsinghua University)
Wenyi Hong (Department of Computer Science and Technology, Tsinghua University)
Wendi Zheng (Tsinghua University)
Chang Zhou (Alibaba Group)
Da Yin (Tsinghua University)
Junyang Lin (Alibaba Group)
Xu Zou (Tsinghua University)
Zhou Shao (Beijing Academy of Artificial Intelligence)
Hongxia Yang (Alibaba Group)
Jie Tang (Tsinghua University)

Jie Tang is a WeBank Chair Professor of Computer Science at Tsinghua University. He is a Fellow of the ACM, a Fellow of AAAI, and a Fellow of IEEE. His research interest is artificial general intelligence (AGI). His research received the SIGKDD Test-of-Time Award (10-year Best Paper), and he also received the SIGKDD Service Award. Recently, he has focused his efforts on large language models (LLMs) such as GLM and ChatGLM.
More from the Same Authors
- 2021: Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning
  Qinkai Zheng · Xu Zou · Yuxiao Dong · Yukuo Cen · Da Yin · Jiarong Xu · YANG YANG · Jie Tang
- 2023 Poster: ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
  Jiazheng Xu · Xiao Liu · Yuchen Wu · Yuxuan Tong · Qinkai Li · Ming Ding · Jie Tang · Yuxiao Dong
- 2022 Poster: CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
  Ming Ding · Wendi Zheng · Wenyi Hong · Jie Tang
- 2021: Invited talk 3
  Jie Tang
- 2021 Poster: Adaptive Diffusion in Graph Neural Networks
  Jialin Zhao · Yuxiao Dong · Ming Ding · Evgeny Kharlamov · Jie Tang
- 2021 Poster: UFC-BERT: Unifying Multi-Modal Controls for Conditional Image Synthesis
  Zhu Zhang · Jianxin Ma · Chang Zhou · Rui Men · Zhikang Li · Ming Ding · Jie Tang · Jingren Zhou · Hongxia Yang
- 2021 Poster: A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems
  Yi Ma · Xiaotian Hao · Jianye Hao · Jiawen Lu · Xing Liu · Tong Xialiang · Mingxuan Yuan · Zhigang Li · Jie Tang · Zhaopeng Meng
- 2020 Poster: Graph Random Neural Networks for Semi-Supervised Learning on Graphs
  Wenzheng Feng · Jie Zhang · Yuxiao Dong · Yu Han · Huanbo Luan · Qian Xu · Qiang Yang · Evgeny Kharlamov · Jie Tang
- 2020 Oral: Graph Random Neural Networks for Semi-Supervised Learning on Graphs
  Wenzheng Feng · Jie Zhang · Yuxiao Dong · Yu Han · Huanbo Luan · Qian Xu · Qiang Yang · Evgeny Kharlamov · Jie Tang
- 2020 Poster: A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices
  Jiezhong Qiu · Chi Wang · Ben Liao · Richard Peng · Jie Tang
- 2020 Poster: Counterfactual Prediction for Bundle Treatment
  Hao Zou · Peng Cui · Bo Li · Zheyan Shen · Jianxin Ma · Hongxia Yang · Yue He
- 2020 Poster: CogLTX: Applying BERT to Long Texts
  Ming Ding · Chang Zhou · Hongxia Yang · Jie Tang
- 2019 Poster: Learning Disentangled Representations for Recommendation
  Jianxin Ma · Chang Zhou · Peng Cui · Hongxia Yang · Wenwu Zhu
- 2019 Poster: Understanding and Improving Layer Normalization
  Jingjing Xu · Xu Sun · Zhiyuan Zhang · Guangxiang Zhao · Junyang Lin
- 2018 Poster: Bandit Learning with Implicit Feedback
  Yi Qi · Qingyun Wu · Hongning Wang · Jie Tang · Maosong Sun