Virtual screening, which identifies potential drugs from vast compound databases to bind with a particular protein pocket, is a critical step in AI-assisted drug discovery. Traditional docking methods are highly time-consuming and, in real-life applications, can only work with a restricted search library. Recent supervised learning approaches that use scoring functions for binding-affinity prediction, although promising, have not yet surpassed docking methods because they depend strongly on limited data with reliable binding-affinity labels. In this paper, we propose a novel contrastive learning framework, DrugCLIP, which reformulates virtual screening as a dense retrieval task and employs contrastive learning to align representations of binding protein pockets and molecules from a large quantity of pairwise data without explicit binding-affinity scores. We also introduce a biological-knowledge-inspired data augmentation strategy to learn better protein-molecule representations. Extensive experiments show that DrugCLIP significantly outperforms traditional docking and supervised learning methods on diverse virtual screening benchmarks with greatly reduced computation time, especially in the zero-shot setting.
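The retrieval formulation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss or the encoders, so everything below is an assumption made for illustration: a CLIP-style symmetric InfoNCE loss, a temperature of 0.07, and random vectors standing in for the outputs of pocket and molecule encoders. The function names `contrastive_loss` and `screen` are hypothetical and are not taken from the DrugCLIP code.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(pocket_emb, mol_emb, temperature=0.07):
    # Assumed symmetric InfoNCE: row i of each matrix is a known binding pair,
    # and every other row in the batch serves as a negative.
    p = l2_normalize(pocket_emb)
    m = l2_normalize(mol_emb)
    logits = p @ m.T / temperature
    idx = np.arange(len(p))
    loss_p2m = -log_softmax(logits, axis=1)[idx, idx].mean()  # pocket -> molecule
    loss_m2p = -log_softmax(logits, axis=0)[idx, idx].mean()  # molecule -> pocket
    return 0.5 * (loss_p2m + loss_m2p)

def screen(pocket_emb, library_emb, top_k=5):
    # Dense retrieval: rank precomputed molecule embeddings by cosine
    # similarity to a single pocket embedding.
    scores = l2_normalize(library_emb) @ l2_normalize(pocket_emb)
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Toy data standing in for encoder outputs (purely hypothetical).
rng = np.random.default_rng(0)
library = rng.normal(size=(1000, 128))
pocket = library[42] + 0.1 * rng.normal(size=128)  # pocket placed near molecule 42
top_idx, top_scores = screen(pocket, library, top_k=5)
```

Because the library embeddings can be computed once and reused, screening a new pocket in this sketch reduces to a single matrix-vector product, which is one plausible reading of the reduced computation time the abstract reports.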
Author Information
Bowen Gao (Tsinghua University)
Bo Qiang (Peking University)
Haichuan Tan (Institute for AI Industry Research (AIR), Tsinghua University)
Yinjun Jia (Tsinghua University)
Minsi Ren (CASIA)
Minsi Lu (Tsinghua University)
Jingjing Liu (Microsoft)
Wei-Ying Ma (Tsinghua University)
Wei-Ying Ma is Huiyan Chair Professor and Chief Scientist at the Institute for AI Industry Research, Tsinghua University. Prior to Tsinghua University, he was a Vice President and the Head of AI Lab at ByteDance, responsible for fundamental research and technology development in areas including machine learning, computer vision, speech and audio processing, natural language processing, personalized recommendation, and search. The technologies from his teams have been used by hundreds of millions of users daily through ByteDance's content platforms and core products such as TikTok, Douyin, and Jinri Toutiao. Prior to ByteDance, Wei-Ying Ma was an Assistant Managing Director at Microsoft Research Asia. He developed many technologies that were successfully transferred to Microsoft’s Bing, Ads Center, Cognitive Services, Exchange, SharePoint, Azure, and Xiaoice. He has contributed to many open-source technologies on GitHub, including the Distributed Machine Learning Toolkit, Microsoft Graph Engine, and Microsoft Concept Graph. Wei-Ying Ma has published more than 300 papers at international conferences and in journals. He has been granted more than 160 patents. He is a Fellow of the IEEE. He has served as program co-chair of the International Conference on World Wide Web (WWW) 2008 and as general co-chair of the ACM Special Interest Group on Information Retrieval (SIGIR) 2011. He has served on many editorial boards, including ACM Transactions on Information Systems (TOIS), the ACM/Springer Multimedia Systems Journal, and the Journal of Multimedia Tools and Applications. He was a member of the International World Wide Web Conferences Steering Committee from 2010 to 2016.
Yanyan Lan (Tsinghua University)
More from the Same Authors
2021: VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li · Jie Lei · Zhe Gan · Licheng Yu · Yen-Chun Chen · Rohit Pillai · Yu Cheng · Luowei Zhou · Xin Wang · William Yang Wang · Tamara L Berg · Mohit Bansal · Jingjing Liu · Lijuan Wang · Zicheng Liu
2021: Multi-modal Self-supervised Pre-training for Large-scale Genome Data
Shentong Mo · Xi Fu · Chenyang Hong · Yizhen Chen · Yuxuan Zheng · Xiangru Tang · Yanyan Lan · Zhiqiang Shen · Eric Xing
2022: Distance-Sensitive Offline Reinforcement Learning
Haoran Xu · Xianyuan Zhan · Haoran Xu · Xiangyu Zhu · Jingjing Liu · Ya-Qin Zhang
2023: Delta Score: Improving the Binding Assessment of Structure-Based Drug Design Methods
Minsi Ren · Bowen Gao · Bo Qiang · Yanyan Lan
2023 Poster: Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
Yuxuan Song · Jingjing Gong · Minkai Xu · Ziyao Cao · Yanyan Lan · Stefano Ermon · Hao Zhou · Wei-Ying Ma
2023 Poster: Idempotent Learned Image Compression with Right-Inverse
Yanghao Li · Tongda Xu · Yan Wang · Jingjing Liu · Ya-Qin Zhang
2022 Spotlight: When Does Group Invariant Learning Survive Spurious Correlations?
Yimeng Chen · Ruibin Xiong · Zhi-Ming Ma · Yanyan Lan
2022 Poster: When Does Group Invariant Learning Survive Spurious Correlations?
Yimeng Chen · Ruibin Xiong · Zhi-Ming Ma · Yanyan Lan
2021 Poster: Uncertainty Calibration for Ensemble-Based Debiasing Methods
Ruibin Xiong · Yimeng Chen · Liang Pang · Xueqi Cheng · Zhi-Ming Ma · Yanyan Lan
2021 Poster: Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective
Tianlong Chen · Yu Cheng · Zhe Gan · Jingjing Liu · Zhangyang Wang
2021 Poster: The Elastic Lottery Ticket Hypothesis
Xiaohan Chen · Yu Cheng · Shuohang Wang · Zhe Gan · Jingjing Liu · Zhangyang Wang