Therapeutics machine learning is an emerging field with incredible opportunities for innovation and impact. However, advancement in this field requires the formulation of meaningful tasks and careful curation of datasets. Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics. To date, TDC includes 66 AI-ready datasets spread across 22 learning tasks and spanning the discovery and development of safe and effective medicines. TDC also provides an ecosystem of tools and community resources, including 33 data functions and diverse types of data splits, 23 strategies for systematic model evaluation, 17 molecule generation oracles, and 29 public leaderboards. All resources are integrated and accessible via an open Python library. We carry out extensive experiments on selected datasets, demonstrating that even the strongest algorithms fall short of solving key therapeutics challenges, including distributional shifts, multi-scale and multi-modal learning, and robust generalization to novel data points. We envision that TDC can facilitate algorithmic advances and considerably accelerate machine-learning model development, validation and transition into biomedical and clinical implementation. TDC is available at https://tdcommons.ai.
Author Information
Kexin Huang (Stanford University)
Tianfan Fu (Georgia Institute of Technology)
Wenhao Gao (Massachusetts Institute of Technology)
Yue Zhao (Carnegie Mellon University)
I am pursuing a Ph.D. in Information Systems at Carnegie Mellon University, advised by Prof. Leman Akoglu. Unlike most IS researchers, I focus on data mining algorithms, systems, and applications. Research Keywords: Outlier & Anomaly Detection; Ensemble Learning; Scalable Machine Learning; Machine Learning Systems.
Yusuf Roohani (Stanford University)
Jure Leskovec (Stanford University/Pinterest)
Connor Coley (MIT)
Cao Xiao (IQVIA)
Jimeng Sun (University of Illinois Urbana-Champaign)
Marinka Zitnik (Harvard University)
More from the Same Authors
-
2021 : Revisiting Time Series Outlier Detection: Definitions and Benchmarks »
Kwei-Herng Lai · Daochen Zha · Junjie Xu · Yue Zhao · Guanchu Wang · Xia Hu -
2021 Spotlight: GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles »
Octavian Ganea · Lagnajit Pattanaik · Connor Coley · Regina Barzilay · Klavs Jensen · William Green · Tommi Jaakkola -
2021 Spotlight: Combiner: Full Attention Transformer with Sparse Computation Cost »
Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao (Sherry) Yang · Jure Leskovec · Dale Schuurmans · Bo Dai -
2021 : OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Hongyu Ren · Maho Nakata · Yuxiao Dong · Jure Leskovec -
2021 : Extending the WILDS Benchmark for Unsupervised Adaptation »
Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang -
2021 : Bringing Atomistic Deep Learning to Prime Time »
Nathan Frey · Siddharth Samsi · Bharath Ramsundar · Connor Coley -
2021 : Scalable Geometric Deep Learning on Molecular Graphs »
Nathan Frey · Siddharth Samsi · Lin Li · Connor Coley -
2021 : Adaptive Pseudo-labeling for Quantum Calculations »
Kexin Huang · Vishnu Sresht · Brajesh Rai -
2022 : De novo PROTAC design using graph-based deep generative models »
Divya Nori · Connor Coley · Rocío Mercado -
2022 : Tabular deep learning when $d \gg n$ by using an auxiliary knowledge graph »
Camilo Ruiz · Hongyu Ren · Kexin Huang · Jure Leskovec -
2022 : Learning Controllable Adaptive Simulation for Multi-scale Physics »
Tailin Wu · Takashi Maruyama · Qingqing Zhao · Gordon Wetzstein · Jure Leskovec -
2022 : Learning Efficient Hybrid Particle-continuum Representations of Non-equilibrium N-body Systems »
Tailin Wu · Michael Sun · Hsuan-Gu Chou · Pranay Reddy Samala · Sithipont Cholsaipant · Sophia Kivelson · Jacqueline Yau · Rex Ying · E. Paulo Alves · Jure Leskovec · Frederico Fiuza -
2022 : AutoTransfer: AutoML with Knowledge Transfer - An Application to Graph Neural Networks »
Kaidi Cao · Jiaxuan You · Jiaju Liu · Jure Leskovec -
2022 : Efficient Automatic Machine Learning via Design Graphs »
Shirley Wu · Jiaxuan You · Jure Leskovec · Rex Ying -
2022 : Recommendation for New Drugs with Limited Prescription Data »
Zhenbang Wu · Huaxiu Yao · Zhe Su · David Liebovitz · Lucas Glass · James Zou · Chelsea Finn · Jimeng Sun -
2023 Poster: Uncertainty Quantification over Graph with Conformalized Graph Neural Networks »
Kexin Huang · Ying Jin · Emmanuel Candes · Jure Leskovec -
2023 Poster: PRODIGY: Enabling In-context Learning Over Graphs »
Qian Huang · Hongyu Ren · Peng Chen · Gregor Kržmanc · Daniel Zeng · Percy Liang · Jure Leskovec -
2023 Poster: When Do Graph Neural Networks Help with Node Classification: Investigating the Homophily Principle on Node Distinguishability »
Sitao Luan · Chenqing Hua · Minkai Xu · Qincheng Lu · Jiaqi Zhu · Xiao-Wen Chang · Jie Fu · Jure Leskovec · Doina Precup -
2023 Poster: BIOT: Biosignal Transformer for Cross-data Learning in the Wild »
Chaoqi Yang · M Westover · Jimeng Sun -
2023 Poster: An Iterative Self-Learning Framework for Medical Domain Generalization »
Zhenbang Wu · Huaxiu Yao · David Liebovitz · Jimeng Sun -
2023 Poster: Prefix-tree decoding for predicting mass spectra from molecules »
Samuel Goldman · John Bradshaw · Jiayi Xin · Connor Coley -
2023 Poster: CoDrug: Conformal Drug Property Prediction with Density Estimation under Covariate Shift »
Siddhartha Laghuvarapu · Zhen Lin · Jimeng Sun -
2023 Poster: Enabling tabular deep learning when $d \gg n$ with an auxiliary knowledge graph »
Camilo Ruiz · Hongyu Ren · Kexin Huang · Jure Leskovec -
2023 Poster: Zero-shot causal learning »
Hamed Nilforoshan · Michael Moor · Yusuf Roohani · Yining Chen · Anja Šurina · Michihiro Yasunaga · Sara Oblak · Jure Leskovec -
2023 Poster: Learning Large Graph Property Prediction via Graph Segment Training »
Kaidi Cao · Phitchaya Phothilimtha · Sami Abu-El-Haija · Dustin Zelle · Yanqi Zhou · Charith Mendis · Jure Leskovec · Bryan Perozzi -
2023 Poster: Holistic Evaluation of Text-to-Image Models »
Tony Lee · Michihiro Yasunaga · Chenlin Meng · Yifan Mai · Joon Sung Park · Agrim Gupta · Yunzhi Zhang · Deepak Narayanan · Hannah Teufel · Marco Bellagente · Minguk Kang · Taesung Park · Jure Leskovec · Jun-Yan Zhu · Fei-Fei Li · Jiajun Wu · Stefano Ermon · Percy Liang -
2023 Poster: Temporal Graph Benchmark for Machine Learning on Temporal Graphs »
Shenyang Huang · Farimah Poursafaei · Jacob Danovitch · Matthias Fey · Weihua Hu · Emanuele Rossi · Jure Leskovec · Michael Bronstein · Guillaume Rabusseau · Reihaneh Rabbany -
2023 Workshop: Generative AI and Biology (GenBio@NeurIPS2023) »
Minkai Xu · Regina Barzilay · Jure Leskovec · Wenxian Shi · Menghua Wu · Zhenqiao Song · Lei Li · Fan Yang · Stefano Ermon -
2023 Workshop: AI for Science: from Theory to Practice »
Yuanqi Du · Max Welling · Yoshua Bengio · Marinka Zitnik · Carla Gomes · Jure Leskovec · Maria Brbic · Wenhao Gao · Kexin Huang · Ziming Liu · Rocío Mercado · Miles Cranmer · Shengchao Liu · Lijing Wang -
2023 Workshop: New Frontiers of AI for Drug Discovery and Development »
Animashree Anandkumar · Ilija Bogunovic · Ti-chiun Chang · Quanquan Gu · Jure Leskovec · Michelle Li · Chong Liu · Nataša Tagasovska · Wei Wang -
2022 Competition: OGB-LSC 2022: A Large-Scale Challenge for ML on Graphs »
Weihua Hu · Matthias Fey · Hongyu Ren · Maho Nakata · Yuxiao Dong · Jure Leskovec -
2022 : Introduction to OGB-LSC »
Jure Leskovec -
2022 : A High-Throughput Platform for Efficient Exploration of Polypeptides Chemical Space via Automation and Machine Learning »
Guangqi Wu · Connor Coley · Hua Lu -
2022 : Automated Materials Synthesis Keynote »
Connor Coley -
2022 : MolPAL: Software for Sample Efficient High-Throughput Virtual Screening »
David Graff · Connor Coley -
2022 : A source data privacy framework for synthetic clinical trial data »
Afrah Shafquat · Jason Mezey · Mandis Beigi · Jimeng Sun · Jacob Aptekar -
2022 Workshop: AI for Science: Progress and Promises »
Yi Ding · Yuanqi Du · Tianfan Fu · Hanchen Wang · Anima Anandkumar · Yoshua Bengio · Anthony Gitter · Carla Gomes · Aviv Regev · Max Welling · Marinka Zitnik -
2022 Poster: ADBench: Anomaly Detection Benchmark »
Songqiao Han · Xiyang Hu · Hailiang Huang · Minqi Jiang · Yue Zhao -
2022 Poster: BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs »
Kay Liu · Yingtong Dou · Yue Zhao · Xueying Ding · Xiyang Hu · Ruitong Zhang · Kaize Ding · Canyu Chen · Hao Peng · Kai Shu · Lichao Sun · Jundong Li · George H Chen · Zhihao Jia · Philip S Yu -
2022 Poster: Deep Bidirectional Language-Knowledge Graph Pretraining »
Michihiro Yasunaga · Antoine Bosselut · Hongyu Ren · Xikun Zhang · Christopher D Manning · Percy Liang · Jure Leskovec -
2022 Poster: Reinforced Genetic Algorithm for Structure-based Drug Design »
Tianfan Fu · Wenhao Gao · Connor Coley · Jimeng Sun -
2022 Poster: ATD: Augmenting CP Tensor Decomposition by Self Supervision »
Chaoqi Yang · Cheng Qian · Navjot Singh · Cao (Danica) Xiao · M Westover · Edgar Solomonik · Jimeng Sun -
2022 Poster: ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time »
Tailin Wu · Megan Tjandrasuwita · Zhengxuan Wu · Xuelin Yang · Kevin Liu · Rok Sosic · Jure Leskovec -
2022 Poster: Learning to Accelerate Partial Differential Equations via Latent Global Evolution »
Tailin Wu · Takashi Maruyama · Jure Leskovec -
2022 Poster: Few-shot Relational Reasoning via Connection Subgraph Pretraining »
Qian Huang · Hongyu Ren · Jure Leskovec -
2022 Poster: TransTab: Learning Transferable Tabular Transformers Across Tables »
Zifeng Wang · Jimeng Sun -
2022 Poster: Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization »
Wenhao Gao · Tianfan Fu · Jimeng Sun · Connor Coley -
2022 Poster: Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Biomolecular Structures and Interaction Networks »
Arian Jamasb · Ramon Viñas Torné · Eric Ma · Yuanqi Du · Charles Harris · Kexin Huang · Dominic Hall · Pietro Lió · Tom Blundell -
2022 Poster: Conformal Prediction with Temporal Quantile Adjustments »
Zhen Lin · Shubhendu Trivedi · Jimeng Sun -
2021 : AI X Chemistry »
Connor Coley -
2021 Workshop: AI for Science: Mind the Gaps »
Payal Chandak · Yuanqi Du · Tianfan Fu · Wenhao Gao · Kexin Huang · Shengchao Liu · Ziming Liu · Gabriel Spadon · Max Tegmark · Hanchen Wang · Adrian Weller · Max Welling · Marinka Zitnik -
2021 Poster: Learning Graph Models for Retrosynthesis Prediction »
Vignesh Ram Somnath · Charlotte Bunne · Connor Coley · Andreas Krause · Regina Barzilay -
2021 Poster: Combiner: Full Attention Transformer with Sparse Computation Cost »
Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao (Sherry) Yang · Jure Leskovec · Dale Schuurmans · Bo Dai -
2021 Poster: Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones »
Yushi Bai · Zhitao Ying · Hongyu Ren · Jure Leskovec -
2021 Poster: Neural Distance Embeddings for Biological Sequences »
Gabriele Corso · Zhitao Ying · Michal Pándy · Petar Veličković · Jure Leskovec · Pietro Liò -
2021 Poster: GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles »
Octavian Ganea · Lagnajit Pattanaik · Connor Coley · Regina Barzilay · Klavs Jensen · William Green · Tommi Jaakkola -
2021 Poster: Automatic Unsupervised Outlier Model Selection »
Yue Zhao · Ryan Rossi · Leman Akoglu -
2020 : Q&A #2 »
Heng Ji · Jure Leskovec · Jiajun Wu -
2020 : Invited Talk #4 »
Jure Leskovec -
2020 Poster: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: Coresets for Robust Training of Deep Neural Networks against Noisy Labels »
Baharan Mirzasoleiman · Kaidi Cao · Jure Leskovec -
2020 Poster: Graph Information Bottleneck »
Tailin Wu · Hongyu Ren · Pan Li · Jure Leskovec -
2020 Spotlight: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning »
Pan Li · Yanbang Wang · Hongwei Wang · Jure Leskovec -
2020 Poster: Handling Missing Data with Graph Representation Learning »
Jiaxuan You · Xiaobai Ma · Yi Ding · Mykel J Kochenderfer · Jure Leskovec -
2020 Demonstration: MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning »
Kexin Huang · Tianfan Fu · Dawood Khan · Ali Abid · Ali Abdalla · Abubaker Abid · Lucas Glass · Marinka Zitnik · Cao Xiao · Jimeng Sun -
2020 Poster: Design Space for Graph Neural Networks »
Jiaxuan You · Zhitao Ying · Jure Leskovec -
2020 Poster: Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs »
Hongyu Ren · Jure Leskovec -
2020 Spotlight: Design Space for Graph Neural Networks »
Jiaxuan You · Zhitao Ying · Jure Leskovec -
2019 : Marinka Zitnik: Graph Neural Networks for Drug Discovery and Development »
Marinka Zitnik -
2019 : Presentation and Discussion: Open Graph Benchmark »
Jure Leskovec -
2019 Workshop: Graph Representation Learning »
Will Hamilton · Rianne van den Berg · Michael Bronstein · Stefanie Jegelka · Thomas Kipf · Jure Leskovec · Renjie Liao · Yizhou Sun · Petar Veličković -
2019 Poster: Hyperbolic Graph Convolutional Neural Networks »
Ines Chami · Zhitao Ying · Christopher Ré · Jure Leskovec -
2019 Poster: G2SAT: Learning to Generate SAT Formulas »
Jiaxuan You · Haoze Wu · Clark Barrett · Raghuram Ramanujan · Jure Leskovec -
2019 Poster: Retrosynthesis Prediction with Conditional Graph Logic Network »
Hanjun Dai · Chengtao Li · Connor Coley · Bo Dai · Le Song -
2019 Poster: GNNExplainer: Generating Explanations for Graph Neural Networks »
Zhitao Ying · Dylan Bourgeois · Jiaxuan You · Marinka Zitnik · Jure Leskovec -
2018 : Panel »
Paroma Varma · Aditya Grover · Will Hamilton · Jessica Hamrick · Thomas Kipf · Marinka Zitnik -
2018 : Poster Session I »
Aniruddh Raghu · Daniel Jarrett · Kathleen Lewis · Elias Chaibub Neto · Nicholas Mastronarde · Shazia Akbar · Chun-Hung Chao · Henghui Zhu · Seth Stafford · Luna Zhang · Jen-Tang Lu · Changhee Lee · Adityanarayanan Radhakrishnan · Fabian Falck · Liyue Shen · Daniel Neil · Yusuf Roohani · Aparna Balagopalan · Brett Marinelli · Hagai Rossman · Sven Giesselbach · Jose Javier Gonzalez Ortiz · Edward De Brouwer · Byung-Hoon Kim · Rafid Mahmood · Tzu Ming Hsu · Antonio Ribeiro · Rumi Chunara · Agni Orfanoudaki · Kristen Severson · Mingjie Mai · Sonali Parbhoo · Albert Haque · Viraj Prabhu · Di Jin · Alena Harley · Geoffroy Dubourg-Felonneau · Xiaodan Hu · Maithra Raghu · Jonathan Warrell · Nelson Johansen · Wenyuan Li · Marko Järvenpää · Satya Narayan Shukla · Sarah Tan · Vincent Fortuin · Beau Norgeot · Yi-Te Hsu · Joel H Saltz · Veronica Tozzo · Andrew Miller · Guillaume Ausset · Azin Asgarian · Francesco Paolo Casale · Antoine Neuraz · Bhanu Pratap Singh Rawat · Turgay Ayer · Xinyu Li · Mehul Motani · Nathaniel Braman · Laetitia M Shao · Adrian Dalca · Hyunkwang Lee · Emma Pierson · Sandesh Ghimire · Yuji Kawai · Owen Lahav · Anna Goldenberg · Denny Wu · Pavitra Krishnaswamy · Colin Pawlowski · Arijit Ukil · Yuhui Zhang -
2018 Poster: Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation »
Jiaxuan You · Bowen Liu · Zhitao Ying · Vijay Pande · Jure Leskovec -
2018 Poster: Dynamic Network Model from Partial Observations »
Elahe Ghalebi · Baharan Mirzasoleiman · Radu Grosu · Jure Leskovec -
2018 Spotlight: Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation »
Jiaxuan You · Bowen Liu · Zhitao Ying · Vijay Pande · Jure Leskovec -
2018 Spotlight: Dynamic Network Model from Partial Observations »
Elahe Ghalebi · Baharan Mirzasoleiman · Radu Grosu · Jure Leskovec -
2018 Poster: Hierarchical Graph Representation Learning with Differentiable Pooling »
Zhitao Ying · Jiaxuan You · Christopher Morris · Xiang Ren · Will Hamilton · Jure Leskovec -
2018 Spotlight: Hierarchical Graph Representation Learning with Differentiable Pooling »
Zhitao Ying · Jiaxuan You · Christopher Morris · Xiang Ren · Will Hamilton · Jure Leskovec -
2018 Poster: Embedding Logical Queries on Knowledge Graphs »
Will Hamilton · Payal Bajaj · Marinka Zitnik · Dan Jurafsky · Jure Leskovec -
2017 : Jure Leskovec, Stanford »
Jure Leskovec -
2017 Poster: Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network »
Wengong Jin · Connor Coley · Regina Barzilay · Tommi Jaakkola -
2017 Poster: Inductive Representation Learning on Large Graphs »
Will Hamilton · Zhitao Ying · Jure Leskovec -
2016 Poster: Confusions over Time: An Interpretable Bayesian Model to Characterize Trends in Decision Making »
Himabindu Lakkaraju · Jure Leskovec -
2013 Workshop: Frontiers of Network Analysis: Methods, Models, and Applications »
Edo M Airoldi · David S Choi · Aaron Clauset · Khalid El-Arini · Jure Leskovec -
2013 Poster: Nonparametric Multi-group Membership Model for Dynamic Networks »
Myunghwan Kim · Jure Leskovec -
2012 Workshop: Social network and social media analysis: Methods, models and applications »
Edo M Airoldi · David S Choi · Khalid El-Arini · Jure Leskovec -
2012 Poster: Learning to Discover Social Circles in Ego Networks »
Julian J McAuley · Jure Leskovec -
2010 Workshop: Networks Across Disciplines: Theory and Applications »
Edo M Airoldi · Anna Goldenberg · Jure Leskovec · Quaid Morris -
2010 Oral: On the Convexity of Latent Social Network Inference »
Seth A Myers · Jure Leskovec -
2010 Poster: On the Convexity of Latent Social Network Inference »
Seth A Myers · Jure Leskovec -
2009 Workshop: Analyzing Networks and Learning With Graphs »
Edo M Airoldi · Jure Leskovec · Jon Kleinberg · Josh Tenenbaum