Deep learning methods are making significant progress on a variety of protein tasks and datasets. However, the field lacks a standard benchmark for evaluating different methods, which hinders the progress of deep learning for proteins. In this paper, we propose such a benchmark, PEER, a comprehensive multi-task benchmark for Protein sEquence undERstanding. PEER provides a diverse set of protein understanding tasks, including protein function prediction, protein localization prediction, protein structure prediction, protein-protein interaction prediction, and protein-ligand interaction prediction. For each task, we evaluate different types of sequence-based methods, including traditional feature engineering approaches, different sequence encoding methods, and large-scale pre-trained protein language models. We also investigate the performance of these methods under the multi-task learning setting. Experimental results show that large-scale pre-trained protein language models achieve the best performance on most individual tasks, and that jointly training on multiple tasks further boosts performance. The datasets and source code of this benchmark will be open-sourced soon.
Author Information
Minghao Xu (Montreal Institute for Learning Algorithms, University of Montreal)
I was born in Shanghai, a fast-developing metropolis in China. I am currently a first-year PhD student at Mila - Quebec AI Institute, advised by Prof. Jian Tang. My research focuses on molecular and protein representation learning for drug discovery, and on structural image representation learning for visual understanding.
Zuobai Zhang (Montreal Institute for Learning Algorithms, University of Montreal)
Jiarui Lu (Mila - Quebec AI Institute)
Zhaocheng Zhu (Mila - Quebec AI Institute)
Yangtian Zhang (Montreal Institute for Learning Algorithms, University of Montreal)
Ma Chang (University of Hong Kong)
I am a Ph.D. student in the Department of Computer Science at The University of Hong Kong, co-advised by Dr. Lingpeng Kong and Dr. Tao Yu. My main research interest is representation learning, along with natural language processing and computational biology. I am passionate about developing new computational methods for applied problems, as well as improving the generalization ability of deep learning.
Runcheng Liu (Carnegie Mellon University)
Jian Tang (Mila)
More from the Same Authors
- 2021: Multi-task Learning with Domain Knowledge for Molecular Property Prediction
  Shengchao Liu · Meng Qu · Zuobai Zhang · Jian Tang
- 2022 Poster: Debiasing Graph Neural Networks via Learning Disentangled Causal Substructure
  Shaohua Fan · Xiao Wang · Yanhu Mo · Chuan Shi · Jian Tang
- 2022: MoleculeCLIP: Learning Transferable Molecule Multi-Modality Models via Natural Language
  Shengchao Liu · Weili Nie · Chengpeng Wang · Jiarui Lu · Zhuoran Qiao · Ling Liu · Jian Tang · Anima Anandkumar · Chaowei Xiao
- 2022: GraphCG: Unsupervised Discovery of Steerable Factors in Graphs
  Shengchao Liu · Chengpeng Wang · Weili Nie · Hanchen Wang · Jiarui Lu · Bolei Zhou · Jian Tang
- 2023 Poster: GAUCHE: A Library for Gaussian Processes in Chemistry
  Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Yuanqi Du · Samuel Stanton · Gary Tom · Bojana Rankovic · Arian Jamasb · Aryan Deshwal · Julius Schwartz · Austin Tripp · Gregory Kell · Simon Frieder · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Johannes Peter Dürholt · Saudamini Chaurasia · Ji Won Park · Felix Strieth-Kalthoff · Alpha Lee · Bingqing Cheng · Alan Aspuru-Guzik · Philippe Schwaller · Jian Tang
- 2023 Poster: Pre-Training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction
  Zuobai Zhang · Minghao Xu · Aurelie Lozano · Vijil Chenthamarakshan · Payel Das · Jian Tang
- 2023 Poster: DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing
  Yangtian Zhang · Zuobai Zhang · Bozitao Zhong · Sanchit Misra · Jian Tang
- 2023 Poster: A*Net: A Scalable Path-based Reasoning Approach for Knowledge Graphs
  Zhaocheng Zhu · Xinyu Yuan · Michael Galkin · Louis-Pascal Xhonneux · Ming Zhang · Maxime Gazeau · Jian Tang
- 2023 Poster: GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning
  Haiteng Zhao · Shengchao Liu · Ma Chang · Hannan Xu · Jie Fu · Zhihong Deng · Lingpeng Kong · Qi Liu
- 2023 Poster: Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials
  Shengchao Liu · Weitao Du · Yanjing Li · Zhuoxinran Li · Zhiling Zheng · Chenru Duan · Zhi-Ming Ma · Omar Yaghi · Animashree Anandkumar · Christian Borgs · Jennifer Chayes · Hongyu Guo · Jian Tang
- 2023 Poster: Evaluating Self-Supervised Learning for Molecular Graph Embeddings
  Hanchen Wang · Jean Kaddour · Shengchao Liu · Jian Tang · Joan Lasenby · Qi Liu
- 2022 Workshop: Graph Learning for Industrial Applications: Finance, Crime Detection, Medicine and Social Media
  Manuela Veloso · John Dickerson · Senthil Kumar · Eren K. · Jian Tang · Jie Chen · Peter Henstock · Susan Tibbs · Ani Calinescu · Naftali Cohen · C. Bayan Bruss · Armineh Nourbakhsh
- 2022 Spotlight: Debiasing Graph Neural Networks via Learning Disentangled Causal Substructure
  Shaohua Fan · Xiao Wang · Yanhu Mo · Chuan Shi · Jian Tang
- 2022 Workshop: Temporal Graph Learning Workshop
  Reihaneh Rabbany · Jian Tang · Michael Bronstein · Shenyang Huang · Meng Qu · Kellin Pelrine · Jianan Zhao · Farimah Poursafaei · Aarash Feizi
- 2022 Poster: Inductive Logical Query Answering in Knowledge Graphs
  Michael Galkin · Zhaocheng Zhu · Hongyu Ren · Jian Tang
- 2022 Poster: High-Order Pooling for Graph Neural Networks with Tensor Decomposition
  Chenqing Hua · Guillaume Rabusseau · Jian Tang
- 2021 Poster: Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction
  Zhaocheng Zhu · Zuobai Zhang · Louis-Pascal Xhonneux · Jian Tang