Therapeutics machine learning is an emerging field with incredible opportunities for innovation and impact. However, advancement in this field requires the formulation of meaningful tasks and careful curation of datasets. Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics. To date, TDC includes 66 AI-ready datasets spread across 22 learning tasks and spanning the discovery and development of safe and effective medicines. TDC also provides an ecosystem of tools and community resources, including 33 data functions and diverse types of data splits, 23 strategies for systematic model evaluation, 17 molecule generation oracles, and 29 public leaderboards. All resources are integrated and accessible via an open Python library. We carry out extensive experiments on selected datasets, demonstrating that even the strongest algorithms fall short of solving key therapeutics challenges, including distributional shifts, multi-scale and multi-modal learning, and robust generalization to novel data points. We envision that TDC can facilitate algorithmic advances and considerably accelerate machine-learning model development, validation and transition into biomedical and clinical implementation. TDC is available at https://tdcommons.ai.
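To make the notion of distributional shift concrete, here is a minimal sketch of a group-held-out split, in which whole groups of related data points (for example, molecules sharing a scaffold) are assigned to a single subset, so the test set contains only groups unseen during training. This is an illustration under stated assumptions and not TDC's API: the function `group_split`, its parameters, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def group_split(records, group_of, frac=(0.7, 0.1, 0.2), seed=0):
    """Split records into train/valid/test so that no group spans two subsets.

    `group_of` maps a record to its group key (hypothetical helper argument).
    Subset sizes only approximate `frac`, because groups are kept whole.
    """
    # Bucket records by group key.
    groups = defaultdict(list)
    for record in records:
        groups[group_of(record)].append(record)

    # Shuffle group keys reproducibly.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    n = len(records)
    split = {"train": [], "valid": [], "test": []}
    for key in keys:
        # Fill train first, then valid; everything left goes to test.
        if len(split["train"]) < frac[0] * n:
            target = "train"
        elif len(split["valid"]) < frac[1] * n:
            target = "valid"
        else:
            target = "test"
        split[target].extend(groups[key])
    return split


# Toy data (assumed for illustration): 20 records in 10 groups of 2.
toy_records = [(f"mol{i}", f"group{i % 10}") for i in range(20)]
toy_split = group_split(toy_records, group_of=lambda r: r[1])
```

Because groups are never divided, a model evaluated on `toy_split["test"]` is scored only on groups it has not seen, which is a harder setting than a uniformly random split.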
Author Information
Kexin Huang (Stanford University)
Tianfan Fu (Georgia Institute of Technology)
Wenhao Gao (Massachusetts Institute of Technology)
Yue Zhao (Carnegie Mellon University)
I am pursuing a Ph.D. in Information Systems at Carnegie Mellon University, advised by Prof. Leman Akoglu. Unlike most IS researchers, I focus on data mining algorithms, systems, and applications. Research keywords: Outlier & Anomaly Detection; Ensemble Learning; Scalable Machine Learning; Machine Learning Systems.
Yusuf Roohani (Stanford University)
Jure Leskovec (Stanford University/Pinterest)
Connor Coley (MIT)
Cao Xiao (IQVIA)
Jimeng Sun (University of Illinois Urbana-Champaign)
Marinka Zitnik (Harvard University)
More from the Same Authors
-
2021 : Revisiting Time Series Outlier Detection: Definitions and Benchmarks »
Kwei-Herng Lai · Daochen Zha · Junjie Xu · Yue Zhao · Guanchu Wang · Xia Hu -
2021 Spotlight: GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles »
Octavian Ganea · Lagnajit Pattanaik · Connor Coley · Regina Barzilay · Klavs Jensen · William Green · Tommi Jaakkola -
2021 Spotlight: Combiner: Full Attention Transformer with Sparse Computation Cost »
Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao Yang · Jure Leskovec · Dale Schuurmans · Bo Dai -
2021 : OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Hongyu Ren · Maho Nakata · Yuxiao Dong · Jure Leskovec -
2021 : Extending the WILDS Benchmark for Unsupervised Adaptation »
Shiori Sagawa · Pang Wei Koh · Tony Lee · Irena Gao · Sang Michael Xie · Kendrick Shen · Ananya Kumar · Weihua Hu · Michihiro Yasunaga · Henrik Marklund · Sara Beery · Ian Stavness · Jure Leskovec · Kate Saenko · Tatsunori Hashimoto · Sergey Levine · Chelsea Finn · Percy Liang -
2021 : Bringing Atomistic Deep Learning to Prime Time »
Nathan Frey · Siddharth Samsi · Bharath Ramsundar · Connor Coley -
2021 : Scalable Geometric Deep Learning on Molecular Graphs »
Nathan Frey · Siddharth Samsi · Lin Li · Connor Coley -
2021 : Adaptive Pseudo-labeling for Quantum Calculations »
Kexin Huang · Vishnu Sresht · Brajesh Rai -
2021 : AI X Chemistry »
Connor Coley -
2021 Workshop: AI for Science: Mind the Gaps »
Payal Chandak · Yuanqi Du · Tianfan Fu · Wenhao Gao · Kexin Huang · Shengchao Liu · Ziming Liu · Gabriel Spadon · Max Tegmark · Hanchen Wang · Adrian Weller · Max Welling · Marinka Zitnik -
2021 Poster: Learning Graph Models for Retrosynthesis Prediction »
Vignesh Ram Somnath · Charlotte Bunne · Connor Coley · Andreas Krause · Regina Barzilay -
2021 Poster: Combiner: Full Attention Transformer with Sparse Computation Cost »
Hongyu Ren · Hanjun Dai · Zihang Dai · Mengjiao Yang · Jure Leskovec · Dale Schuurmans · Bo Dai -
2021 Poster: Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones »
Yushi Bai · Zhitao Ying · Hongyu Ren · Jure Leskovec -
2021 Poster: Neural Distance Embeddings for Biological Sequences »
Gabriele Corso · Zhitao Ying · Michal Pándy · Petar Veličković · Jure Leskovec · Pietro Liò -
2021 Poster: GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles »
Octavian Ganea · Lagnajit Pattanaik · Connor Coley · Regina Barzilay · Klavs Jensen · William Green · Tommi Jaakkola -
2021 Poster: Automatic Unsupervised Outlier Model Selection »
Yue Zhao · Ryan Rossi · Leman Akoglu -
2020 : Q&A #2 »
Heng Ji · Jure Leskovec · Jiajun Wu -
2020 : Invited Talk #4 »
Jure Leskovec -
2020 Poster: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: Coresets for Robust Training of Deep Neural Networks against Noisy Labels »
Baharan Mirzasoleiman · Kaidi Cao · Jure Leskovec -
2020 Poster: Graph Information Bottleneck »
Tailin Wu · Hongyu Ren · Pan Li · Jure Leskovec -
2020 Spotlight: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning »
Pan Li · Yanbang Wang · Hongwei Wang · Jure Leskovec -
2020 Poster: Handling Missing Data with Graph Representation Learning »
Jiaxuan You · Xiaobai Ma · Yi Ding · Mykel J Kochenderfer · Jure Leskovec -
2020 Demonstration: MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning »
Kexin Huang · Tianfan Fu · Dawood Khan · Ali Abid · Ali Abdalla · Abubaker Abid · Lucas Glass · Marinka Zitnik · Cao Xiao · Jimeng Sun -
2020 Poster: Design Space for Graph Neural Networks »
Jiaxuan You · Zhitao Ying · Jure Leskovec -
2020 Poster: Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs »
Hongyu Ren · Jure Leskovec -
2020 Spotlight: Design Space for Graph Neural Networks »
Jiaxuan You · Zhitao Ying · Jure Leskovec -
2019 : Marinka Zitnik: Graph Neural Networks for Drug Discovery and Development »
Marinka Zitnik -
2019 : Presentation and Discussion: Open Graph Benchmark »
Jure Leskovec -
2019 Workshop: Graph Representation Learning »
Will Hamilton · Rianne van den Berg · Michael Bronstein · Stefanie Jegelka · Thomas Kipf · Jure Leskovec · Renjie Liao · Yizhou Sun · Petar Veličković -
2019 Poster: Hyperbolic Graph Convolutional Neural Networks »
Ines Chami · Zhitao Ying · Christopher Ré · Jure Leskovec -
2019 Poster: G2SAT: Learning to Generate SAT Formulas »
Jiaxuan You · Haoze Wu · Clark Barrett · Raghuram Ramanujan · Jure Leskovec -
2019 Poster: Retrosynthesis Prediction with Conditional Graph Logic Network »
Hanjun Dai · Chengtao Li · Connor Coley · Bo Dai · Le Song -
2019 Poster: GNNExplainer: Generating Explanations for Graph Neural Networks »
Zhitao Ying · Dylan Bourgeois · Jiaxuan You · Marinka Zitnik · Jure Leskovec -
2018 : Panel »
Paroma Varma · Aditya Grover · Will Hamilton · Jessica Hamrick · Thomas Kipf · Marinka Zitnik -
2018 : Poster Session I »
Aniruddh Raghu · Daniel Jarrett · Kathleen Lewis · Elias Chaibub Neto · Nicholas Mastronarde · Shazia Akbar · Chun-Hung Chao · Henghui Zhu · Seth Stafford · Luna Zhang · Jen-Tang Lu · Changhee Lee · Adityanarayanan Radhakrishnan · Fabian Falck · Liyue Shen · Daniel Neil · Yusuf Roohani · Aparna Balagopalan · Brett Marinelli · Hagai Rossman · Sven Giesselbach · Jose Javier Gonzalez Ortiz · Edward De Brouwer · Byung-Hoon Kim · Rafid Mahmood · Tzu Ming Hsu · Antonio Ribeiro · Rumi Chunara · Agni Orfanoudaki · Kristen Severson · Mingjie Mai · Sonali Parbhoo · Albert Haque · Viraj Prabhu · Di Jin · Alena Harley · Geoffroy Dubourg-Felonneau · Xiaodan Hu · Maithra Raghu · Jonathan Warrell · Nelson Johansen · Wenyuan Li · Marko Järvenpää · Satya Narayan Shukla · Sarah Tan · Vincent Fortuin · Beau Norgeot · Yi-Te Hsu · Joel H Saltz · Veronica Tozzo · Andrew Miller · Guillaume Ausset · Azin Asgarian · Francesco Paolo Casale · Antoine Neuraz · Bhanu Pratap Singh Rawat · Turgay Ayer · Xinyu Li · Mehul Motani · Nathaniel Braman · Laetitia M Shao · Adrian Dalca · Hyunkwang Lee · Emma Pierson · Sandesh Ghimire · Yuji Kawai · Owen Lahav · Anna Goldenberg · Denny Wu · Pavitra Krishnaswamy · Colin Pawlowski · Arijit Ukil · Yuhui Zhang -
2018 Poster: Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation »
Jiaxuan You · Bowen Liu · Zhitao Ying · Vijay Pande · Jure Leskovec -
2018 Poster: Dynamic Network Model from Partial Observations »
Elahe Ghalebi · Baharan Mirzasoleiman · Radu Grosu · Jure Leskovec -
2018 Spotlight: Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation »
Jiaxuan You · Bowen Liu · Zhitao Ying · Vijay Pande · Jure Leskovec -
2018 Spotlight: Dynamic Network Model from Partial Observations »
Elahe Ghalebi · Baharan Mirzasoleiman · Radu Grosu · Jure Leskovec -
2018 Poster: Hierarchical Graph Representation Learning with Differentiable Pooling »
Zhitao Ying · Jiaxuan You · Christopher Morris · Xiang Ren · Will Hamilton · Jure Leskovec -
2018 Spotlight: Hierarchical Graph Representation Learning with Differentiable Pooling »
Zhitao Ying · Jiaxuan You · Christopher Morris · Xiang Ren · Will Hamilton · Jure Leskovec -
2018 Poster: Embedding Logical Queries on Knowledge Graphs »
Will Hamilton · Payal Bajaj · Marinka Zitnik · Dan Jurafsky · Jure Leskovec -
2017 : Jure Leskovec, Stanford »
Jure Leskovec -
2017 Poster: Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network »
Wengong Jin · Connor Coley · Regina Barzilay · Tommi Jaakkola -
2017 Poster: Inductive Representation Learning on Large Graphs »
Will Hamilton · Zhitao Ying · Jure Leskovec -
2016 Poster: Confusions over Time: An Interpretable Bayesian Model to Characterize Trends in Decision Making »
Himabindu Lakkaraju · Jure Leskovec -
2013 Workshop: Frontiers of Network Analysis: Methods, Models, and Applications »
Edo M Airoldi · David S Choi · Aaron Clauset · Khalid El-Arini · Jure Leskovec -
2013 Poster: Nonparametric Multi-group Membership Model for Dynamic Networks »
Myunghwan Kim · Jure Leskovec -
2012 Workshop: Social network and social media analysis: Methods, models and applications »
Edo M Airoldi · David S Choi · Khalid El-Arini · Jure Leskovec -
2012 Poster: Learning to Discover Social Circles in Ego Networks »
Julian J McAuley · Jure Leskovec -
2010 Workshop: Networks Across Disciplines: Theory and Applications »
Edo M Airoldi · Anna Goldenberg · Jure Leskovec · Quaid Morris -
2010 Oral: On the Convexity of Latent Social Network Inference »
Seth A Myers · Jure Leskovec -
2010 Poster: On the Convexity of Latent Social Network Inference »
Seth A Myers · Jure Leskovec -
2009 Workshop: Analyzing Networks and Learning With Graphs »
Edo M Airoldi · Jure Leskovec · Jon Kleinberg · Josh Tenenbaum