DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening
Bowen Gao · Bo Qiang · Haichuan Tan · Yinjun Jia · Minsi Ren · Minsi Lu · Jingjing Liu · Wei-Ying Ma · Yanyan Lan

Thu Dec 14 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #103

Virtual screening, which identifies potential drugs from vast compound databases to bind with a particular protein pocket, is a critical step in AI-assisted drug discovery. Traditional docking methods are highly time-consuming and, in real-life applications, can only work with a restricted search library. Recent supervised learning approaches that use scoring functions for binding-affinity prediction, although promising, have not yet surpassed docking methods because they depend strongly on limited data with reliable binding-affinity labels. In this paper, we propose DrugCLIP, a novel contrastive learning framework that reformulates virtual screening as a dense retrieval task and employs contrastive learning to align representations of binding protein pockets and molecules using a large quantity of pairwise data without explicit binding-affinity scores. We also introduce a biological-knowledge-inspired data augmentation strategy to learn better protein-molecule representations. Extensive experiments show that DrugCLIP significantly outperforms traditional docking and supervised learning methods on diverse virtual screening benchmarks with greatly reduced computation time, especially in the zero-shot setting.
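The retrieval formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the pocket and molecule encoders are replaced by random vectors, the loss is a generic symmetric InfoNCE (CLIP-style) objective, and the function names, the temperature of 0.07, and the embedding sizes are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    # Scale vectors to unit length so that dot products are cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(pocket_emb, mol_emb, temperature=0.07):
    # Row i of pocket_emb and row i of mol_emb are assumed to be a binding pair;
    # every other molecule in the batch acts as a negative for that pocket.
    p = l2_normalize(pocket_emb)
    m = l2_normalize(mol_emb)
    logits = p @ m.T / temperature
    idx = np.arange(len(p))
    loss_pocket_to_mol = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_mol_to_pocket = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_pocket_to_mol + loss_mol_to_pocket)


def screen(pocket_emb, library_emb, top_k=5):
    # Dense retrieval: rank every library molecule by cosine similarity
    # to a single pocket embedding and return the top_k indices and scores.
    scores = l2_normalize(library_emb) @ l2_normalize(pocket_emb)
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]


# Toy data standing in for encoder outputs (hypothetical, not real embeddings).
rng = np.random.default_rng(0)
mols = rng.normal(size=(8, 16))
pockets = mols + 0.05 * rng.normal(size=(8, 16))

aligned_loss = symmetric_contrastive_loss(pockets, mols)
shuffled_loss = symmetric_contrastive_loss(pockets, mols[::-1])
top_idx, top_scores = screen(pockets[3], mols, top_k=3)
```

In this toy setup the loss is lower when each pocket is paired with its own molecule than when the pairing is reversed, and screening reduces to a single matrix-vector product over precomputed library embeddings, which is where the computational saving over per-compound docking would come from.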

Author Information

Bowen Gao (Tsinghua University)
Bo Qiang (Peking University)
Haichuan Tan (Institute for AI Industry Research (AIR), Tsinghua University)
Yinjun Jia (Tsinghua University)
Minsi Ren (CASIA)
Minsi Lu (Tsinghua University)
Jingjing Liu (Microsoft)
Wei-Ying Ma (Tsinghua University)

Wei-Ying Ma is Huiyan Chair Professor and Chief Scientist at the Institute for AI Industry Research, Tsinghua University. Prior to Tsinghua University, he was a Vice President and the Head of AI Lab at ByteDance, responsible for fundamental research and technology development in areas including machine learning, computer vision, speech and audio processing, natural language processing, and personalized recommendation and search. The technologies from his teams have been used by hundreds of millions of users daily through ByteDance's content platforms and core products such as TikTok, Douyin, and Jinri Toutiao. Prior to ByteDance, Wei-Ying Ma was an Assistant Managing Director at Microsoft Research Asia. He developed many technologies that were successfully transferred to Microsoft's Bing, Ads Center, Cognitive Services, Exchange, SharePoint, Azure, and Xiaoice. He has contributed to many open-source technologies on GitHub, including the Distributed Machine Learning Toolkit, Microsoft Graph Engine, and Microsoft Concept Graph. Wei-Ying Ma has published more than 300 papers at international conferences and in journals, and has been granted more than 160 patents. He is a Fellow of the IEEE. He has served as program co-chair of the International Conference on World Wide Web (WWW) 2008 and as general co-chair of the ACM Special Interest Group on Information Retrieval (SIGIR) 2011. He has served on many editorial boards, including ACM Transactions on Information Systems (TOIS), the ACM/Springer Multimedia Systems Journal, and the Journal of Multimedia Tools and Applications. He was a member of the International World Wide Web Conferences Steering Committee from 2010 to 2016.

Yanyan Lan (Tsinghua University)

More from the Same Authors