Machine learning (ML) has revolutionized a wide array of scientific disciplines, including chemistry, biology, physics, materials science, neuroscience, earth science, cosmology, electronics, and mechanical engineering. It has solved scientific challenges that had never been solved before, e.g., predicting 3D protein structures, imaging black holes, and automating drug discovery. Despite this promise, several critical gaps stifle algorithmic and scientific innovation in "AI for Science": (1) unrealistic methodological assumptions or directions, (2) overlooked scientific questions, (3) limited exploration at the intersections of multiple disciplines, (4) the science of science, and (5) the responsible use and development of AI for science.
However, very little work has been done to bridge these gaps, mainly because of the missing link between distinct scientific communities. While many workshops focus on AI for specific scientific disciplines, each is concerned with methodological advances within a single discipline (e.g., biology) and is thus unable to examine the crucial questions above. This workshop will fill that unmet need and facilitate community building; with hundreds of ML researchers beginning projects in this area, the workshop will bring them together to consolidate the fast-growing area of "AI for Science" into a recognized field.
Mon 5:00 a.m. - 5:15 a.m. | Opening Remarks
Mon 5:15 a.m. - 6:00 a.m. | AI X Science (Invited Talks) | Tie-Yan Liu
Mon 6:00 a.m. - 6:45 a.m. | AI X Mathematics (Invited Talks) | Petar Veličković
Mon 7:10 a.m. - 8:10 a.m. | Live Panel (Discussion Panel) | Max Welling · Bharath Ramsundar · Irina Rish · Karianne J Bergen · Pushmeet Kohli
Mon 9:00 a.m. - 9:45 a.m. | AI X Chemistry (Invited Talks) | Connor Coley
Mon 9:45 a.m. - 9:55 a.m. | Discovering Dynamical Parameters by Interpreting Echo State Networks (Poster)
Reservoir computing architectures known as echo state networks (ESNs) have been shown to have exceptional predictive capabilities when trained on chaotic systems. However, ESN models are often seen as black-box predictors that lack interpretability. We show that the parameters governing the dynamics of a complex nonlinear system can be encoded in the learned readout layer of an ESN. We can extract these dynamical parameters by examining the geometry of the readout layer weights through principal component analysis. We demonstrate this approach by extracting the values of three dynamical parameters ($\sigma$, $\rho$, $\beta$) from a dataset of Lorenz systems where all three parameters are varying among different trajectories. Our proposed method not only demonstrates the interpretability of the ESN readout layer but also provides a computationally inexpensive, unsupervised data-driven approach for identifying uncontrolled variables affecting real-world data from nonlinear dynamical systems.
Peter Lu · Marin Soljacic
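A minimal sketch of the abstract's recipe: stack the learned readout weights from many trained ESNs and apply PCA, then check whether the leading principal coordinates track the dynamical parameters. All names, shapes, and the random placeholder data are illustrative assumptions, not the authors' code.

```python
# Sketch: look for dynamical parameters in the geometry of ESN readout
# weights via PCA, as described in the abstract. Shapes are illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_traj, n_reservoir, n_out = 200, 500, 3

# Placeholder: in practice each row is the flattened readout layer of an
# ESN trained on one Lorenz trajectory with its own (sigma, rho, beta).
readouts = rng.normal(size=(n_traj, n_reservoir * n_out))

pca = PCA(n_components=10)
coords = pca.fit_transform(readouts)  # (n_traj, 10) principal coordinates

# If the readout geometry encodes the dynamical parameters, coords[:, :3]
# should vary systematically with the true (sigma, rho, beta) values,
# e.g. measurable by correlating each coordinate with each parameter.
```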
Mon 10:00 a.m. - 10:45 a.m. | AI X Molecule (Invited Talks) | Jian Tang
Mon 10:45 a.m. - 10:55 a.m. | Scientific Argument with Supervised Learning (Poster)
The use of machine learning (ML) for scientific discovery has enabled data-driven approaches to new and old questions alike. We argue that scientific arguments based on algorithms for discovery hold the potential to reinforce existing assumptions about phenomena, under the guise of testing them. Using examples from image-based biological classification, we show how scientific arguments using supervised learning can contribute to unintended, unrealistic, or under-evidenced claims.
Jeffrey Lockhart · Abigail Jacobs
Mon 11:00 a.m. - 11:45 a.m. | AI X Cosmology (Invited Talks) | Shirley Ho
Mon 11:45 a.m. - 11:55 a.m. | Apertures in Agriculture Seeking Attention (Poster)
Agriculture is arguably the one economic activity that serves as the backbone of human civilization: it provides the most essential lifeline for human survival, namely, food. Today, we stand at a juncture where, the world over, all key stakeholders in this activity, i.e., producers (farmers), consumers, and the planet, are facing grave concerns. Formulating these concerns concretely and leveraging AI methods in developing solution strategies can help alleviate major risks to food systems. However, local contexts such as cultural practices, physical terrains, and socio-economic status pose unique challenges to directly employing existing AI techniques across geographies. In this position paper, we highlight some such key challenges, specifically focusing on India and the developing world. We outline the problems of key stakeholders and identify some gaps that need to be filled in addressing these problems.
Sanchita Das · Ramya Srinivasan
Mon 12:00 p.m. - 12:10 p.m. | Uncovering motif interactions from convolutional-attention networks for genomics (Poster)
A major goal of computational genomics is to understand how sequence patterns, called motifs, interact to regulate gene expression. In principle, convolutional-attention networks (CANs) should provide an inductive bias to infer motif interactions: convolutions can capture motifs while self-attention learns their interactions. However, the extent to which this holds in practice is unclear. Here we perform an empirical study on synthetic data to test the efficacy of uncovering motif interactions in CANs. We find that irrespective of design choice, interpreting local attention (i.e., on an individual-sequence basis) is noisy, leading to many false-positive motif interactions. To address this issue, we propose Global Interactions via Filter Activity Correlations (GLIFAC). GLIFAC robustly uncovers motif interactions across a wide spectrum of model choices. This work provides guidance on design choices for CANs that lead to better interpretability for regulatory genomics without sacrificing generalization performance.
Rohan Ghotra · Peter Koo
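A hedged sketch of the GLIFAC idea as the abstract describes it: estimate global motif interactions by correlating first-layer filter activities across a dataset, rather than interpreting per-sequence attention maps. The pooling choice and shapes are assumptions, not the authors' exact method.

```python
# Sketch: global filter-filter interaction estimates from activity
# correlations across a dataset of one-hot DNA sequences.
import torch

def filter_activity_correlations(conv_layer, sequences):
    """sequences: (N, 4, L) one-hot DNA; conv_layer: a torch.nn.Conv1d."""
    with torch.no_grad():
        acts = conv_layer(sequences)          # (N, n_filters, L')
    # Summarize each filter's activity per sequence (max over positions).
    summary = acts.amax(dim=-1)               # (N, n_filters)
    # Pearson correlation between filters across the whole dataset.
    z = (summary - summary.mean(0)) / (summary.std(0) + 1e-8)
    return (z.T @ z) / z.shape[0]              # (n_filters, n_filters)
```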
Mon 12:10 p.m. - 12:55 p.m. | AI X Discovery (Invited Talks) | Yoshua Bengio
Mon 1:00 p.m. - 1:45 p.m. | AI X Neuroscience (Invited Talks) | Tomaso Poggio
Mon 1:45 p.m. - 1:55 p.m. | Awards and Closing Remarks
Drug Re-positioning via Text Augmented Knowledge Graph Embeddings (Poster)
Drug re-positioning, modeled as a link prediction problem over medical knowledge graphs (KGs), has great potential for finding new uses or targets for approved medicines at relatively low cost. However, the semantic information in medical KGs is rarely utilized, let alone the external medical databases curated by domain experts. This work attempts to integrate textual descriptions of biomedical KG entities into the training of knowledge graph embeddings (KGEs) and evaluates its effectiveness for drug re-positioning. We implement multiple text augmentation methods on TransE as a case study and further apply the best method to other embedding models. Both qualitative and quantitative error analyses with two novel metrics are conducted to shed light on the effects of adding textual information to our model. We conclude that textual information is generally useful, but it may also backfire.
Mian Zhong · Tiancheng Hu · Ying Jiao · Shehzaad Dhuliawala
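One plausible way to inject entity descriptions into TransE, in the spirit of the abstract; the fusion-by-addition choice and all dimensions are assumptions, not the authors' exact augmentation method.

```python
# Sketch: text-augmented TransE. Entity representation = learned KG
# embedding + a projection of the entity's description vector.
import torch
import torch.nn as nn

class TextAugmentedTransE(nn.Module):
    def __init__(self, n_entities, n_relations, dim, text_dim):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.text_proj = nn.Linear(text_dim, dim)  # maps description vectors

    def entity_repr(self, idx, text_vec):
        # Combine the learned KG embedding with the projected description.
        return self.ent(idx) + self.text_proj(text_vec)

    def score(self, h, h_txt, r, t, t_txt):
        # TransE: a true triple (h, r, t) should satisfy h + r ~ t,
        # so higher (less negative) scores indicate more plausible links.
        head = self.entity_repr(h, h_txt)
        tail = self.entity_repr(t, t_txt)
        return -(head + self.rel(r) - tail).norm(p=1, dim=-1)
```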
Novel fuzzy approach to Antimicrobial Peptide Activity Prediction: A tale of limited and imbalanced data that models won’t hear (Poster)
Antimicrobial peptides (AMPs) have gained immense attention in recent years due to their potential for developing novel antibacterial medicines, next-generation anti-cancer treatment regimes, etc. Owing to the significant cost and time required for wet-lab-based AMP screening, researchers have framed the task as an ML problem. However, traditional models rely on the unrealistic premise that large amounts of medical data are available in order to achieve significant performance levels; otherwise, they overfit, decreasing model precision. The collection of such labeled medical data is a challenging and expensive task in itself. The current study is the first to examine models in a real-world setting, training them on restricted and highly imbalanced data. A fuzzy-intelligence-based model is proposed for short (<30 aa) AMP activity prediction, and its ability to learn a severely skewed, high-dimensional mapping from limited data is demonstrated over a set of experiments. The proposed model significantly outperforms state-of-the-art ML models trained on the same data. The findings demonstrate the model's efficacy as a potential method for in silico AMP activity prediction.
Aviral Chharia
3D Pre-training improves GNNs for Molecular Property Prediction (Poster)
Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their predictions for many molecular properties. However, this information is infeasible to compute at the scale required by most real-world applications. We propose pre-training a model to understand the geometry of molecules given only their 2D molecular graph. Using methods from self-supervised learning, we maximize the mutual information between a 3D summary vector and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to inform downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of molecular properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Crucially, the learned representations can be effectively transferred between datasets with vastly different molecules.
Hannes Stärk · Dominique Beaini · Gabriele Corso · Prudencio Tossou · Christian Dallago · Stephan Günnemann · Pietro Lió
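A common way to realize the mutual-information objective the abstract describes is an InfoNCE-style contrastive loss between the 2D-graph embedding and the 3D summary vector of the same molecule; this sketch shows that generic objective, not necessarily the authors' exact estimator.

```python
# Sketch: contrastive MI lower bound between 2D GNN embeddings and 3D
# summary vectors. Matching molecule pairs sit on the diagonal.
import torch
import torch.nn.functional as F

def infonce(z_2d, z_3d, temperature=0.1):
    """z_2d, z_3d: (B, D) embeddings of the same B molecules."""
    z_2d = F.normalize(z_2d, dim=-1)
    z_3d = F.normalize(z_3d, dim=-1)
    logits = z_2d @ z_3d.T / temperature      # (B, B) similarity matrix
    labels = torch.arange(z_2d.shape[0])      # positives on the diagonal
    # Each 2D embedding must identify its own 3D summary within the batch.
    return F.cross_entropy(logits, labels)
```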
GraphGT: Machine Learning Datasets for Graph Generation and Transformation (Poster)
Graph generation, which learns from known graphs and discovers novel ones, has great potential in numerous research topics such as drug design and mobility synthesis, and is one of the fastest-growing domains recently due to its promise for discovering new knowledge. Although many benchmark datasets have emerged for graph representation learning, real-world datasets for the graph generation problem are much fewer and limited to a small number of areas such as molecules and citation networks. To fill the gap, we introduce GraphGT, a large dataset collection for the graph generation problem in machine learning, which contains 36 datasets from 9 domains across 6 subjects. To assist researchers in exploring the datasets, we provide a systematic review and classification of the datasets from various views, including research tasks, graph types, and application domains. In addition, GraphGT provides an easy-to-use graph generation pipeline that simplifies graph data loading, experimental setup, and model evaluation. The community can query and access datasets of interest according to a specific domain, task, or type of graph. GraphGT will be regularly updated and welcomes input from the community. GraphGT is publicly available at \url{https://graphgt.github.io/} and can also be accessed via an open Python library.
Yuanqi Du · Shiyu Wang · Xiaojie Guo · Hengning Cao · Shujie Hu · Junji Jiang · Aishwarya Varala · Abhinav Angirekula · Liang Zhao
Neuroprospecting with DeepRL agents (Poster)
A virtuous cycle between neuroscience and deep reinforcement learning is emerging, and the AI community can do much to enable and accelerate it.
Physical Benchmarking for AI-generated Cosmic Web (Poster)
The potential of deep learning based image-to-image translation has recently drawn a lot of attention in the scientific machine learning community. One problem of interest is the possibility of generating physically meaningful cosmological data while reducing the computational cost involved in high-resolution numerical simulations. Such an effort requires optimizing neural networks beyond low-order statistics like pixel-wise mean square error, and validating results beyond visual comparisons and two-point statistics. In order to study learning-based cosmological evolution, we choose a tractable analytical prescription, the Zel'dovich approximation, modeled using a convolutional image-translation framework called U-Net. A comprehensive list of metrics pertaining to preserving physical laws is proposed, including higher-order correlation functions, conservation laws, topological indicators, dynamical robustness, and statistical independence of cosmological density fields. In addition to validating AI-generated scientific datasets using rigorous physical benchmarks, this study motivates advancements in domain-specific optimization schemes for scientific machine learning.
Xiaofeng Dong · Salman Habib · Sandeep Madireddy
Linear Transformations in Autoencoder Latent Space Predict Time Translations in Active Matter System (Poster)
Machine Learning (ML) approaches are promising for deriving dynamical predictions of physical systems from data. ML approaches are relevant in active matter, a field that spans scales and studies dynamics of far-from-equilibrium systems where there are significant challenges in predicting macroscopic behavior from microscopic interactions of active particles. A major challenge in applying ML to active systems is encoding a continuous representation of time within a neural network. In this work, we develop a framework for predicting the dynamics of active networks of protein filaments and motors by combining a low-dimensional latent representation inferred through an autoencoder with a linear shift neural network that encodes time translation as a linear transformation within the latent space. Our method enables predicting the contraction and boundary deformations of active networks with various geometries. Although our method is trained to predict 20 time steps into the future, it can generalize to periods of 60 time steps and recapitulate the past 30 frames of a single given observation with less than 10% error. Finally, we derive an approximate analytic expression for the linear transformation in the latent space that captures the dynamics. Broadly, our study reveals that neural networks are powerful for forecasting the behavior of active matter systems in the complete absence of knowledge of the microscopic dynamics.
Enrique Amaya · Shahriar Shadkhoo · Dominik Schildknecht · Matt Thomson
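The core architectural idea in the abstract is simple to state in code: an autoencoder gives a latent code z_t, and a linear map A advances time, z_{t+1} ≈ A z_t, so k steps ahead is A^k z_t. The module sizes below are illustrative assumptions.

```python
# Sketch: autoencoder + linear latent "shift" that encodes time
# translation, following the abstract's description.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32))
decoder = nn.Sequential(nn.Linear(32, 64 * 64), nn.Unflatten(1, (64, 64)))
shift = nn.Linear(32, 32, bias=False)  # time translation in latent space

def predict(frame, k):
    """Roll the latent state forward k time steps, then decode."""
    z = encoder(frame)            # frame: (B, 64, 64) observation
    for _ in range(k):
        z = shift(z)              # repeated application of A
    return decoder(z)
```

In training, the reconstruction loss on decoder(shift^k(encoder(frame_t))) against frame_{t+k} ties the linear map to the true dynamics; generalizing past the training horizon then amounts to applying A more times.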
Transfer Learning Approaches for Knowledge Discovery in Grid-based Geo-Spatiotemporal Data (Poster)
Extracting and meticulously analyzing geo-spatiotemporal features is crucial to recognizing the intricate underlying causes of natural events such as floods. Limited evidence about the hidden factors leading to climate change makes it challenging to predict regional water discharge accurately. In addition, the explosive growth of complex geo-spatiotemporal environment data, which requires state-of-the-art neural networks to be retrained for every new region, emphasizes the need for new computationally efficient methods, advanced computational resources, and extensive training on massive amounts of available monitored data. We therefore propose HydroDeep, an effectively reusable pretrained model that addresses the problem of transferring knowledge from one region to another by capturing their intrinsic geo-spatiotemporal variance. Further, we present four transfer learning approaches on HydroDeep for spatiotemporal interpretability that improve Nash–Sutcliffe efficiency by 9% to 108% in new regions with a 95% reduction in training time.
Aishwarya Sarkar · Chaoqun Lu · Ali Jannesari
Physics-Augmented Learning: A New Paradigm Beyond Physics-Informed Learning (Poster)
Integrating physical inductive biases into machine learning can improve model generalizability. We generalize the successful paradigm of physics-informed learning (PIL) into a more general framework that also includes what we term physics-augmented learning (PAL). PIL and PAL complement each other by handling discriminative and generative properties, respectively. In numerical experiments, we show that PAL performs well on examples where PIL is inapplicable or inefficient.
Ziming Liu · Yuanqi Du · Yunyue Chen · Max Tegmark
Generative Neural Network Based Non-Convex Optimization Using Policy Gradients with an Application to Electromagnetic Design (Poster)
A generative neural network based non-convex optimization algorithm using a one-step implementation of the policy gradient method is introduced and applied to electromagnetic design. We demonstrate state-of-the-art performance on electromagnetic devices called grating couplers, with key advantages over local gradient-based optimization via the adjoint method.
Sean Hooten
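A minimal sketch of the general technique named in the abstract, one-step policy-gradient (REINFORCE) optimization of a blackbox objective: a generator parameterizes a distribution over designs, and the score-function estimator pushes probability mass toward high-reward designs. The electromagnetic simulator is stubbed with a toy quadratic; everything here is illustrative, not the author's implementation.

```python
# Sketch: REINFORCE-style blackbox optimization with a Gaussian "policy"
# over candidate designs and a mean-reward baseline for variance reduction.
import torch

dim = 16
mu = torch.zeros(dim, requires_grad=True)
log_std = torch.zeros(dim, requires_grad=True)
opt = torch.optim.Adam([mu, log_std], lr=0.05)

def reward(x):  # placeholder for e.g. a grating-coupler efficiency solver
    return -(x - 1.0).pow(2).sum(dim=-1)

for step in range(200):
    dist = torch.distributions.Normal(mu, log_std.exp())
    x = dist.sample((64,))                     # batch of candidate designs
    r = reward(x)                              # blackbox evaluations
    baseline = r.mean()                        # variance reduction
    loss = -((r - baseline) * dist.log_prob(x).sum(-1)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```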
Single Reference Frequency Loss for Multi-frequency Wavefield Representation using Physics-Informed Neural Networks (Poster)
Physics-informed neural networks (PINNs) offer approximate multidimensional functional solutions to the Helmholtz equation that are flexible, require low memory, and place no limitations on the shape of the solution space. However, neural network (NN) training can be costly, and the cost increases dramatically when training for multi-frequency wavefields, even if we add frequency as an input to the NN multidimensional function, as the variation of the wavefield with frequency adds complexity to the training. Thus, we propose a new loss function for the NN multidimensional input training that allows us to seamlessly include frequency as a dimension. We specifically utilize the linear relation between frequency and wavenumber (a space representation) to incorporate a reference frequency scaling into the loss function. As a result, the effective wavenumber of the wavefield solution as a function of frequency remains stationary, reducing the learning burden on the NN function. We demonstrate the effectiveness of this modified loss function on a layered model.
Xinquan Huang
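One way to formalize the reference-frequency trick the abstract describes (an assumed reconstruction, not necessarily the author's exact loss): for the Helmholtz equation with velocity $v(\mathbf{x})$, rescaling the spatial input by the frequency ratio $\omega/\omega_0$ makes the effective wavenumber frequency-independent.

```latex
% Assumed reconstruction of the scaling described in the abstract.
\[
  \nabla^2 u + \frac{\omega^2}{v^2(\mathbf{x})}\, u = 0,
  \qquad
  \tilde{\mathbf{x}} = \frac{\omega}{\omega_0}\,\mathbf{x}
  \;\Longrightarrow\;
  \nabla^2_{\tilde{\mathbf{x}}}\, u + \frac{\omega_0^2}{v^2(\mathbf{x})}\, u = 0,
\]
% so a PINN taking (x-tilde, omega) as input sees the stationary
% effective wavenumber omega_0 / v at every frequency.
```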
Improving Hit-finding: Multilabel Neural Architecture with DEL (Poster)
DNA-Encoded Library (DEL) data, often comprising millions of data points, enables large deep learning models to make real contributions to the drug discovery process (e.g., hit-finding). The current state-of-the-art method for modeling DEL data, the GCNN multiclass model, requires domain experts to create mutually exclusive classification labels from multiple selection readouts of DEL data, which is not always an ideal way to formulate the problem. In this work, we designed a GCNN multilabel architecture that directly models each selection readout to eliminate the corresponding dependency on human expertise. We selected effective choices for key modeling components, such as the label reduction scheme, via in silico evaluation. To assess its performance in real-world drug discovery settings, we further carried out prospective wet-lab testing, where the multilabel model shows consistent improvement in hit rate (the percentage of hits in a proposed molecule list) over the current state-of-the-art multiclass model.
Kehang Han · Steven Kearnes · Jin Xu · Wen Torng
Distributed Deep Learning for Persistent Monitoring of Agricultural Fields (Poster)
Distributed deep learning algorithms have shown excellent performance in learning from data that are privately allocated among several agents. Recent advances in sensor technology have enabled the cheap collection of spatially and temporally high-resolution agricultural data across wide geographical areas. This continuous increase in the amount of data collected has created both the opportunity for, and the need to, deploy distributed deep learning algorithms for a wide variety of decision-support tasks in agriculture. Distributed deep learning algorithms are typically divided into two major categories, centralized and decentralized learning algorithms, depending on whether a central parameter server exists for gathering information from participating agents. In rural agricultural applications, transferring large amounts of high-resolution data (e.g., images, videos) collected with IoT devices to a central server/cloud can be very expensive, especially with limited communication infrastructure. This suggests the need for decentralized learning approaches, which also naturally provide some measure of privacy. Here, autoencoders are trained using a decentralized optimization algorithm to create a latent representation of growing maize plants in a large-scale field experiment involving several hundred cameras deployed in a maize genome diversity growth experiment. We trained the autoencoders for different communication network topologies of the field-deployed cameras. The feature representations from these autoencoders are then utilized to solve downstream tasks such as anomaly detection and image retrieval. Experimental results show that distributed deep learning is effective in learning from large datasets distributed among several learning agents associated with different cameras. Anomaly detection in particular was useful for making course corrections to the imaging protocol and identifying localized crop management needs.
Yasaman Esfandiari · Koushik Nagasubramanian · Fateme Fotouhi · Patrick Schnable · Baskar Ganapathysubramanian · Soumik Sarkar
Advanced Methods for Connectome-Based Predictive Modeling of Human Intelligence: A Novel Approach Based on Individual Differences in Cortical Topography (Poster)
Individual differences in human intelligence can be modeled and predicted from in vivo neurobiological connectivity. Many established modeling frameworks for predicting intelligence, however, discard higher-order information about individual differences in brain network topology, and show only moderate performance when generalized to make predictions in out-of-sample subjects. In this paper, we propose that connectome-based predictive modeling, a common predictive modeling framework for neuroscience data, can be productively modified to incorporate information about brain network topology and individual differences via the incorporation of bagged decision trees and the network based statistic. These modifications produce a novel predictive modeling framework that leverages individual differences in cortical tractography to generate accurate regression predictions of intelligence. Network topology-based feature selection provides for natively interpretable networks as input features, increasing the model's explainability. Investigating the proposed modeling framework's efficacy, we find that advanced connectome-based predictive modeling generates neuroscience predictions that account for a significantly greater proportion of variance in intelligence than previously established methods, advancing our scientific understanding of the network architecture that underlies human intelligence.
Evan Anderson · Anuj Nayak · Pablo Robles-Granda · Lav Varshney · Been Kim · Aron K Barbey
Towards trustworthy explanations with gradient-based attribution methods (Poster)
The low interpretability of deep neural networks (DNNs) remains a key barrier to their widespread adoption in the sciences. Attribution methods offer a promising solution, providing feature importance scores that serve as first-order model explanations for a given input. In practice, gradient-based attribution methods, such as saliency maps, can yield noisy importance scores depending on model architecture and training procedure. Here we explore how various regularization techniques affect model explanations with saliency maps using synthetic regulatory genomic data, which allows us to quantitatively assess the efficacy of attribution maps. Strikingly, we find that generalization performance does not imply better saliency explanations, though, unlike prior work, we do not observe a clear tradeoff. Interestingly, we find that conventional regularization strategies, when tuned appropriately, can yield generalization and interpretability performance similar to what can be achieved with more sophisticated techniques, such as manifold mixup. Our work challenges the conventional wisdom that model selection should be based on test performance alone; another criterion is needed to sub-select models ideally suited for downstream post hoc interpretability for scientific discovery.
Ethan Labelson · Rohit Tripathy · Peter Koo
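For readers unfamiliar with the attribution method under study: a saliency map is just the gradient of a class score with respect to the input. A minimal, generic computation (the model and shapes are placeholders, not the paper's setup):

```python
# Sketch: vanilla saliency map, i.e. |d score / d input|.
import torch

def saliency_map(model, x, class_idx):
    """x: (1, C, L) input, e.g. a one-hot genomic sequence."""
    x = x.clone().requires_grad_(True)
    score = model(x)[0, class_idx]   # scalar score for the chosen class
    score.backward()                 # backprop to the input
    return x.grad.abs()              # importance per input position
```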
AI as statistical methods for imperfect theories (Poster)
Science has progressed by reasoning about what models could not predict because they were missing important ingredients. And yet, without correct models, standard statistical methods for scientific evidence are not sound. Here I argue that machine-learning methodology provides solutions to ground reasoning about empirical evidence more on models’ predictions and less on their ingredients.
Gael Varoquaux
From Convolutions towards Spikes: The Environmental Metric that the Community currently Misses (Poster)
Today, the AI community is obsessed with state-of-the-art scores (80% of NeurIPS papers) as the major performance metric, due to which an important parameter, the environmental metric, remains unreported. Computational capability was a limiting factor a decade ago; however, in foreseeable future circumstances, the challenge will be to develop environment-friendly and power-efficient algorithms. The human brain, which has been optimizing itself for almost a million years, consumes the same amount of power as a typical laptop. Developing nature-inspired algorithms is therefore one solution. In this study, we show that currently used ANNs are not what we find in nature, and why, despite their lower performance, spiking neural networks, which mirror the mammalian visual cortex, have attracted much interest. We further highlight the hardware gaps that restrict researchers from using spike-based computation to develop neuromorphic, energy-efficient microchips at scale. Using neuromorphic processors instead of traditional GPUs might be more environment-friendly and efficient; such processors would turn SNNs into an ideal solution for the problem. This paper highlights the current gaps and the lack of comparative research, and proposes new research directions at the intersection of two fields: neuroscience and deep learning. Further, we define a new evaluation metric, 'NATURE', for reporting the carbon footprint of AI models.
Aviral Chharia · Shivu Chauhan
Identification of Enzymatic Active Sites with Unsupervised Language Modelling (Poster)
The first decade of genome sequencing saw a surge in the characterisation of proteins with unknown functionality. Even now, more than 20% of proteins in well-studied model animals have yet to be functionally characterised, making the discovery of their active sites one of biology's greatest challenges. Herein, we apply a transformer architecture to a language representation of bio-catalyzed chemical reactions to learn the signal at the base of the substrate-active site atomic interactions. The language representation comprises a reaction simplified molecular-input line-entry system (SMILES) for substrate and products, complemented with amino acid (AA) sequence information for the enzyme. Defining a custom tokenizer and a score based on attention values, we show that we can capture the substrate-active site interaction signal and use it to detect the location of the active site in unknown protein sequences, hence elucidating complex 3D interactions relying solely on 1D representations. We consider a Transformer-based model, BERT, trained with different losses, and analyse its performance in comparison with a statistical baseline and methods based on sequence alignments. Our approach exhibits remarkable results and is able to recover, with no supervision, 31.51% of the active site when considering co-crystallized substrate-enzyme structures as ground truth, largely outperforming sequence-alignment-based approaches. Our findings are further corroborated by docking simulations on the 3D structures of a few enzymes. This work confirms the unprecedented impact of natural language processing, and more specifically of the transformer architecture, on domain-specific languages, paving the way to effective solutions for protein functional characterisation and bio-catalysis engineering.
Loïc Kwate Dassi · Matteo Manica · Daniel Probst · Philippe Schwaller · Yves Gaetan Nana Teukam · Teodoro Laino
A Fresh Look at De Novo Molecular Design Benchmarks (Poster)
De novo molecular design is a thriving research area in machine learning (ML) that lacks ubiquitous, high-quality, standardized benchmark tasks. Many existing benchmark tasks do not precisely specify a training dataset or an evaluation budget, which is problematic as they can significantly affect the performance of ML algorithms. This work elucidates the effect of dataset sizes and experimental budgets on established molecular optimization methods through a comprehensive evaluation with 11 selected benchmark tasks. We observe that the dataset size and budget significantly impact all methods' performance and relative ranking, suggesting that a meaningful comparison requires more than a single benchmark setup. Our results also highlight the relative difficulty of benchmarks, implying in particular that logP and QED are poor objectives. We end by offering guidance to researchers on their choice of experiments.
Austin Tripp · Gregor Simm · José Miguel Hernández-Lobato
Semi-supervised Graph Neural Network for Particle-level Noise Removal (Poster)
The high instantaneous luminosity of the CERN Large Hadron Collider leads to multiple proton-proton interactions in the same or nearby bunch crossings (pileup). Advanced pileup mitigation algorithms are designed to remove this pileup particle noise and improve the performance of physics observables crucial to the science goals. This study applies a semi-supervised graph neural network to particle-level pileup noise removal by identifying the particles produced from pileup. The graph neural network is trained on charged particles with well-known labels, which can be obtained from simulation truth information or measurements from data, and inferred on neutral particles for which such labeling is missing. This semi-supervised approach does not depend on ground-truth information from simulation and thus allows us to perform training directly on real data. The performance of this approach is found to be consistently better than widely used domain algorithms and comparable to a fully supervised training approach. The study serves as the first attempt at applying semi-supervised learning to pileup mitigation, and opens up a new direction of fully data-driven pileup mitigation techniques.
Tianchun Li · Shikun Liu · Nhan Tran · Mia Liu · Pan Li
Human-in-the-loop for a Disconnection Aware Retrosynthesis (Poster)
Retrosynthesis is an approach commonly undertaken when considering the manufacture of novel molecules. During this process, a target molecule is broken down and analyzed by considering the bonds to be changed as well as the functional group interconversions. In modern computer-assisted synthesis planning tools, the predictions of these changes are typically carried out automatically. However, there may be some benefit to the decision being guided by those executing the process: typically, chemists have a clear idea of where the retrosynthetic change should happen, but not how such a transformation is to be realized. Using a data-driven model, the retrosynthesis task can be further explored by giving chemists the option to explore specific disconnections. In this work, we design an approach to provide this option to those carrying out retrosynthetic analysis by adapting a transformer-based model for single-step retrosynthesis. The model takes as input a product SMILES string in which the atoms where the transformation should occur are tagged accordingly. This model predicts precursors corresponding to a disconnection occurring at the correct location in 88.9% of the test set reactions. An assessment with a forward prediction model shows that 76% of the predictions are chemically correct, with 14.1% perfectly matching the ground truth.
Andrea Byekwaso · Philippe Schwaller · Alain C. Vaucher · Teodoro Laino
Learning Large-Time-Step Molecular Dynamics with Graph Neural Networks (Poster)
Molecular dynamics (MD) simulation predicts the trajectory of atoms by solving Newton's equations of motion with a numeric integrator. Due to physical constraints, the time step of the integrator needs to be small to maintain sufficient precision, which limits the efficiency of simulation. To this end, we introduce a graph neural network (GNN) based model, MDNet, to predict the evolution of coordinates and momenta with large time steps. In addition, MDNet can easily scale to larger systems due to its linear complexity with respect to system size. We demonstrate the performance of MDNet on a 4000-atom system with large time steps, and show that MDNet can predict good equilibrium and transport properties, well aligned with standard MD simulations.
Weihao Gao · Chong Wang
Learning to Simulate Unseen Physical Systems with Graph Neural Networks (Poster)
Recently there has been increasing interest in learning to simulate the dynamics of physical systems via machine learning. However, existing approaches fail to generalize to physical substances not in the training set, such as liquids with different viscosities or elastomers with different elasticities. Here we present a machine learning method embedded with physical priors and material parameters, which we term the “Graph-based Physics Engine” (GPE), to efficiently model the physical dynamics of different substances in a wide variety of challenging scenarios. We demonstrate that GPE can generalize to material properties not seen in the training set simply by modifying the physical parameters, and that it performs well from single-step predictions to multi-step roll-out simulations. GPE provides new insights into the construction of learnable simulators and is a key step toward predicting unknown physics problems in the real world.
Weihao Gao · Chong Wang
On the feasibility of small-data learning in simulation-driven engineering tasks with known mechanisms and effective data representations (Poster)
The application of machine learning (ML) to scientific tasks is increasing, especially ML with structured representations in simulation-driven engineering tasks. While previous studies have focused on large-data learning, recent studies are investigating small-data learning and effective, case-specific representations, which is significant for industrial practice. This article provides a theoretical discussion of the feasibility of small-data learning with structured representations, which is then verified through the surrogate modelling of hot stamping simulations. Future directions are also discussed.
Haosu Zhou · Hamid Attar
Towards Brain-to-Text Generation: Neural Decoding with Pre-trained Encoder-Decoder Models (Poster)
Decoding language from non-invasive brain signals is crucial for building widely applicable brain-computer interfaces (BCIs). However, most existing studies have focused on discriminating which of two stimuli corresponds to a given brain image, which is far from directly generating text from neural activities. To move towards this goal, we first propose two neural decoding tasks of incremental difficulty. The first and simpler task is to predict a word given a brain image and a context, which is the first step towards text generation. The second, more difficult task is to directly generate text from a given brain image and a prefix. To address the two tasks, we propose a general approach that leverages a powerful pre-trained encoder-decoder model to predict a word or generate a text fragment. Our model achieves 18.20% and 7.95% top-1 accuracy over a vocabulary of more than 2,000 words, on average across all participants, on the two tasks respectively, significantly outperforming strong baselines. These results demonstrate the feasibility of directly generating text from neural activities in a non-invasive way. We hope our work can push practical non-invasive neural language decoders a step further.
Shuxian Zou · Shaonan Wang · Jiajun Zhang
AI Methods for Designing Energy-Efficient Buildings in Urban Environments (Poster)
Designing energy-efficient buildings is an essential necessity, since buildings are responsible for a significant proportion of energy consumption globally. This concern is even more critical in urban environments, where it is harder to understand and model energy use. Recently, Artificial Intelligence (AI) and Machine Learning (ML) have been explored for improving the energy consumption of buildings. However, the advances in AI and ML have not been fully exploited in the building design process. This paper aims to highlight the gap between the advancements in AI and its applications to energy-efficient buildings in urban environments. The article discusses opportunities in this direction and suggests future research toward buildings that adapt to ever-changing conditions.
· Hazem Hajj
Fragment-Based Sequential Translation for Molecular Optimization (Poster)
The search for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing generative models for molecules operate at the atom level. We instead focus on generating molecular fragments: meaningful substructures of molecules. We construct a coherent latent representation for molecular fragments through a learned variational autoencoder (VAE) that is capable of generating diverse and meaningful fragments. Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which iteratively translates model-discovered molecules into increasingly novel molecules with high property scores. Empirical evaluation shows that FaST achieves significant improvement over state-of-the-art methods on benchmark single-objective and multi-objective molecular optimization tasks.
Benson Chen · Xiang Fu · Regina Barzilay · Tommi Jaakkola
Traversing Geodesics to Grow Biological Structures (Poster)
Biological tissues reliably grow into precise, functional structures from simple starting states during development. Throughout the developmental process, the energy of a tissue changes depending on its natural resistance to deformations such as stretching, bending, shearing, and torsion. In this paper, we represent tissue structures as shapes and develop a mathematical framework to discover paths on the tissue shape manifold to minimize the total energy during development. We find that paths discovered by gradient descent and the geodesic algorithm outperform naive shape interpolation in energetic terms and resemble strategies observed in development. Broadly, these tools can be used to understand and compare shape transformations in biology and propose optimal strategies for synthetic tissue engineering.
Pranav Bhamidipati · Guruprasad Raghavan · Matt Thomson
$\textit{Ab Initio}$ Discovery of Biological Knowledge from scRNA-Seq Data Using Machine Learning (Poster)
Expectations for machine learning (ML) to discover new patterns in high-throughput biological data are high, but most practice relies on existing knowledge to design experiments. Investigations of the power and limitations of ML in revealing complex patterns from data without the guidance of existing knowledge have been lacking. In this study, we conducted systematic experiments on such $\textit{ab initio}$ knowledge discovery with ML methods on single-cell RNA-sequencing data of early embryonic development. Results showed that a strategy combining unsupervised and supervised ML can reveal major cell lineages with minimal involvement of prior knowledge or manual intervention, and that this $\textit{ab initio}$ mining enabled a new discovery of human early embryonic cell differentiation. The study illustrates the feasibility, significance, and limitations of $\textit{ab initio}$ ML knowledge discovery on complex biological problems.
Jiaqi Li · Fanhong Li · Sijie Chen
Improving the spectral resolution of fMRI signals through the temporal de-correlation approach (Poster)
The inherently infra-slow, narrowband fMRI signal prevents fMRI from being considered an optimal neuroimaging modality, relative to alternatives such as EEG and MEG, for investigating the spectral character of cortical activities. To enhance the spectral resolution of the fMRI signal, we put forward a novel linear transformation approach that encourages multivariate fMRI time series and their temporal derivatives to be temporally de-correlated from each other. We present thorough empirical validations of our temporal de-correlation approach on multiple independent fMRI datasets, along with empirical comparisons against several alternative methods. Across all employed fMRI datasets, we observe a general increase in the spectral resolution of temporally de-correlated fMRI signals, in the form of wider frequency bandwidth and spectral characteristics more distinctive than those of the original signals.
Bai Wenjun · Junichiro Yoshimoto
Joint Content-Context Analysis of Scientific Publications: Identifying Opportunities for Collaboration in Cognitive Science (Poster)
This work studies publications in the field of cognitive science and utilizes mathematical techniques to connect the analysis of the papers' content (abstracts) to their context (citations, journals). We apply hierarchical topic modeling to the abstracts and community detection algorithms to the citation network, and measure content-context discrepancy to find academic fields that study similar topics but do not cite each other or publish in the same venues. These results show a promising, systematic framework for identifying opportunities for scientific collaboration in highly interdisciplinary fields such as cognitive science and machine learning.
Harlin Lee · Jacob G Foster
Bayesian Optimal Experimental Design for Simulator Models of Cognition (Poster)
Bayesian optimal experimental design (BOED) is a methodology to identify experiments that are expected to yield informative data. Recent work in cognitive science considered BOED for computational models of human behavior with tractable and known likelihood functions. However, tractability often comes at the cost of realism; simulator models that can capture the richness of human behavior are often intractable. In this work, we combine recent advances in BOED and approximate inference for intractable models, using machine-learning methods to find optimal experimental designs, approximate sufficient summary statistics and amortized posterior distributions. Our simulation experiments on multi-armed bandit tasks show that our method results in improved model discrimination and parameter estimation, as compared to experimental designs commonly used in the literature.
Simon Valentin · Steven Kleinegesse · Neil Bramley · Michael Gutmann · Chris Lucas
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study (Poster)
Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug design and repurposing. Recent work has shown that general-domain language models (LMs) can serve as "soft" KGs, and that they can be fine-tuned for the task of KG completion. In this work, we study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We evaluate several domain-specific LMs, fine-tuning them on datasets centered on drugs and diseases that we represent as KGs and enrich with textual entity descriptions. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance. Finally, we demonstrate the advantage of LM models in the inductive setting with novel scientific entities. Our datasets and code are made publicly available.
Rahul Nadkarni · David Wadden · Iz Beltagy · Noah Smith · Hanna Hajishirzi · Tom Hope
Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery (Poster)
Isolated silos of scientific research and the growing challenge of information overload limit awareness across the literature and hinder innovation. Algorithmic curation and recommendation, which often prioritize relevance, can further reinforce these informational "filter bubbles." In response, we describe Bridger, a system for facilitating discovery of scholars and their work, to explore design tradeoffs between relevant and novel recommendations. We construct a faceted representation of authors with information gleaned from their papers and inferred author personas, and use it to develop an approach that locates commonalities ("bridges") and contrasts between scientists -- retrieving partially similar authors rather than aiming for strict similarity. In studies with computer science researchers, this approach helps users discover authors considered useful for generating novel research directions, outperforming a state-of-the-art neural model. In addition to recommending new content, we also demonstrate an approach for displaying it in a manner that boosts researchers' ability to understand the work of authors with whom they are unfamiliar. Finally, our analysis reveals that Bridger connects authors who have different citation profiles, publish in different venues, and are more distant in social co-authorship networks, raising the prospect of bridging diverse communities and facilitating discovery.
Jason Portenoy · Jevin West · Eric Horvitz · Daniel Weld · Tom Hope
Molecular Energy Learning Using Alternative Blackbox Matrix-Matrix Multiplication Algorithm for Exact Gaussian Process (Poster)
We present an application of the blackbox matrix-matrix multiplication (BBMM) algorithm to scale up Gaussian process (GP) training of molecular energies in the molecular-orbital based machine learning (MOB-ML) framework. An alternative implementation of BBMM (AltBBMM) is also proposed to train more efficiently (over four-fold speedup) with the same accuracy and transferability as the original BBMM implementation. MOB-ML training was previously limited to 220 molecules; BBMM and AltBBMM scale it up by over 30 times, to 6,500 molecules (more than a million pair energies). The accuracy and transferability of both algorithms are examined on benchmark datasets of organic molecules with 7 and 13 heavy atoms. These lower-scaling implementations of the GP preserve state-of-the-art learning efficiency in the low-data regime while extending it to the large-data regime with better accuracy than other available machine learning works on molecular energies.
Jiace Sun · Lixue Cheng · Thomas Miller
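For context, the open-source GPyTorch library's exact GP inference is built on the BBMM algorithm the abstract refers to. A minimal exact-GP regression in GPyTorch looks as follows; the toy data and kernel choice are illustrative, not the MOB-ML setup.

```python
# Minimal GPyTorch exact GP (BBMM-backed inference under the hood).
import torch
import gpytorch

class ExactGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(nu=2.5))

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

train_x = torch.linspace(0, 1, 100)
train_y = torch.sin(6 * train_x) + 0.1 * torch.randn(100)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGP(train_x, train_y, likelihood)

# Fit hyperparameters by maximizing the exact marginal log likelihood.
model.train(); likelihood.train()
opt = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
for _ in range(100):
    opt.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    opt.step()
```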
Regression modeling on DNA encoded libraries (Poster)
DNA-encoded libraries (DELs) are pooled, combinatorial compound collections where each member is tagged with its own unique DNA barcode. DELs are used in drug discovery for early hit finding against protein targets. Recently, several groups have proposed building machine learning models with quantities derived from DEL datasets. However, DEL datasets have a low signal-to-noise ratio, which makes modeling them challenging. To that end, we propose a novel graph neural network (GNN) based regression model that directly predicts enrichment scores from raw sequencing counts while accounting for multiple sources of technical variation and intrinsic assay noise. We show that our GNN regression model quantitatively outperforms standard classification approaches and can be used to find diverse sets of molecules in external virtual libraries.
Ralph Ma · Gabriel Dreiman · Fiorella Ruggiu · Adam Riesselman · Bowen Liu · Mohammad M Sultan · Daphne Koller
A Search Engine for Discovery of Scientific Challenges and Directions (Poster)
Keeping track of scientific challenges, advances and emerging directions is a fundamental part of research. However, researchers face a flood of papers that hinders discovery of important knowledge. In biomedicine, this directly impacts human lives. To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery. We construct and release an expert-annotated corpus of texts sampled from full-length papers, labeled with novel semantic categories that generalize across many types of challenges and directions. We focus on a large corpus of interdisciplinary work relating to the COVID-19 pandemic, ranging from biomedicine to areas such as AI and economics. We apply a model trained on our data to identify challenges and directions across the corpus and build a dedicated search engine. In experiments with 19 researchers and clinicians using our system, we outperform a popular scientific search engine in assisting knowledge discovery. Finally, we show that models trained on our resource generalize to the wider biomedical domain and to AI papers, highlighting its broad utility. We make our data, model and search engine publicly available.
Dan Lahav · Jon Saad-Falcon · Duen Horng Chau · Diyi Yang · Eric Horvitz · Daniel Weld · Tom Hope
Bringing Atomistic Deep Learning to Prime Time (Poster)
Artificial intelligence has not yet revolutionized the design of materials and molecules. In this perspective, we identify four barriers preventing the integration of atomistic deep learning, molecular science, and high-performance computing. We outline focused research efforts to address the opportunities presented by these challenges.
Nathan Frey · Siddharth Samsi · Bharath Ramsundar · Connor Coley
Scalable Geometric Deep Learning on Molecular Graphs (Poster)
Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing. Bottlenecks with respect to the amount of training data, the size and complexity of model architectures, and the scale of the compute infrastructure are all key factors limiting the scaling of deep learning for molecules and materials. Here, we present LitMatter, a lightweight framework for scaling molecular deep learning methods. We train four graph neural network architectures on over 400 GPUs and investigate the scaling behavior of these methods. Depending on the model architecture, training time speedups up to 60x are seen. Empirical neural scaling relations quantify the model-dependent scaling and enable optimal compute resource allocation and the identification of scalable molecular geometric deep learning model implementations.
Nathan Frey · Siddharth Samsi · Lin Li · Connor Coley
Scalable Bayesian Optimization Accelerates Process Optimization of Penicillin Production (Poster)
While Bayesian optimization (BO) has emerged as a sample-efficient optimization method for accelerating drug discovery, it has rarely been applied to the process optimization of pharmaceutical manufacturing, which has traditionally relied on human intuition, trial and error, and slow cycles of learning. The combinatorial and hierarchical complexity of such process control also introduces challenges related to high-dimensional design spaces and requirements for larger-scale observations, on which BO has typically scaled poorly. In this paper, we use penicillin production as a case study to demonstrate the efficacy of BO in accelerating the optimization of typical pharmaceutical manufacturing processes. To overcome the challenges raised by high dimensionality, we apply a trust-region BO approach (TuRBO) for global optimization of penicillin yield and empirically show that it outperforms other BO and random baselines. We also extend the study by leveraging BO in the context of multi-objective optimization, allowing us to further evaluate the trade-offs between penicillin yield, production time, and CO$_2$ emission as a by-product. By quantifying the performance of BO for high-dimensional and multi-objective optimization of drug production processes, we hope to popularize the application of BO in this field and encourage closer collaboration between the machine learning and broader scientific communities.
Qiaohao Liang
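A simplified single-objective BO loop in the open-source BoTorch library, to illustrate the general workflow; the paper's actual method is the trust-region variant TuRBO, and the penicillin fermentation simulator is stubbed out with a toy function here.

```python
# Sketch: expected-improvement BO loop (not TuRBO's trust regions).
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

def yield_fn(x):  # placeholder for the penicillin process simulator
    return -(x - 0.3).pow(2).sum(dim=-1, keepdim=True)

bounds = torch.stack([torch.zeros(5, dtype=torch.double),
                      torch.ones(5, dtype=torch.double)])
X = torch.rand(8, 5, dtype=torch.double)   # initial random designs
Y = yield_fn(X)

for _ in range(20):
    gp = SingleTaskGP(X, Y)                # surrogate over the design space
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acq = ExpectedImprovement(gp, best_f=Y.max())
    cand, _ = optimize_acqf(acq, bounds=bounds, q=1,
                            num_restarts=5, raw_samples=64)
    X = torch.cat([X, cand])               # evaluate and augment the data
    Y = torch.cat([Y, yield_fn(cand)])
```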
This Earthquake Doesn't Exist (Poster)
This study applies conditional Generative Adversarial Networks (cGANs) to the field of seismology. With GANs, realistic seismic waveforms can be created for various applications, such as augmenting limited seismic data, modeling, or generating realistic noise. A potential and alarming application of GANs is to generate realistic seismic signals that could disturb monitoring under the international treaty banning nuclear explosions (CTBT). Results show that the generated seismic waves are nearly indistinguishable from real ones.
Artemii Novoselov · Krisztina Sinkovics
Multi-task Learning with Domain Knowledge for Molecular Property Prediction (Poster)
Multi-task learning for molecular property prediction is becoming increasingly important in drug discovery. However, in contrast to other domains, the performance of multi-task learning in drug discovery is still not satisfying, as the number of labeled data points for each task is too limited, which calls for additional data to address this scarcity. In this paper, we study multi-task learning for molecular property prediction in a different setting, where a relation graph between different tasks is available. We first extract a dataset including ~400 tasks as well as a relation graph between the tasks. Then we systematically investigate modeling the relations between tasks: (1) in the latent space, by learning effective task representations on the task relation graph; and (2) in the output space, via structured prediction methods (e.g., energy-based methods). Empirical results prove the effectiveness of our proposed approaches.
Shengchao Liu · Meng Qu · Zuobai Zhang · Jian Tang
High-Dimensional Discrete Bayesian Optimization with Self-Supervised Representation Learning for Data-Efficient Materials Exploration (Poster)
A materials exploration model based on high-dimensional discrete Bayesian optimization is introduced. Features were extracted from a large-scale database of ab initio calculations by self-supervised representation learning. Materials exploration was carried out based on 100 prior target values from 6,218 candidate materials. As a baseline, ten human experts in materials science were selected and their exploration efficiency evaluated. Under the same conditions, the proposed discrete algorithm was 1.93 times as efficient as the human experts on average, while a conventional continuous algorithm could not outperform them.
Masaki Adachi
-
|
Adaptive Pseudo-labeling for Quantum Calculations
(
Poster
)
link »
Machine learning models have recently shown promise in predicting molecular quantum chemical properties. However, the path to real-life adoption requires (1) learning under low-resource constraint and (2) out-of-distribution generalization to unseen, structurally diverse molecules. We observe that these two challenges originate from label scarcity issue. We hypothesize that pseudo-labeling on vast array of unlabeled molecules can serve as proxies as gold-label to greatly expand the training labeled data. The challenge in pseudo-labeling is to prevent the bad pseudo-labels from biasing the model. We develop a simple and effective strategy Pseudo-Sigma that can assign pseudo-labels, detect bad pseud-labels through evidential uncertainty, and then prevent them from biasing the model using adaptive weighting. Empirically, Pseudo-Sigma improves quantum calculations accuracy across full data, low data and out-of-distribution settings. |
Kexin Huang · Vishnu Sresht · Brajesh Rai 🔗 |
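The general pattern of uncertainty-weighted pseudo-labeling can be sketched as follows. Here a tiny ensemble's predictive variance stands in for the paper's evidential uncertainty, and each pseudo-label is down-weighted by its uncertainty when training the student; this is an illustration of the idea, not the Pseudo-Sigma implementation.

```python
# Uncertainty-weighted pseudo-labeling: teacher labels the unlabeled pool,
# student down-weights uncertain pseudo-labels. Ensemble variance is a
# stand-in for evidential uncertainty. Illustrative sketch only.
import torch
import torch.nn as nn

def mlp():
    return nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

x_lab, y_lab = torch.randn(100, 16), torch.randn(100, 1)  # scarce gold labels
x_unlab = torch.randn(1000, 16)                           # vast unlabeled pool

# teacher: small ensemble trained on the gold labels
teachers = [mlp() for _ in range(5)]
for t in teachers:
    opt = torch.optim.Adam(t.parameters(), lr=1e-2)
    for _ in range(200):
        loss = ((t(x_lab) - y_lab) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    preds = torch.stack([t(x_unlab) for t in teachers])   # (5, 1000, 1)
    pseudo_y = preds.mean(0)
    sigma2 = preds.var(0)                                 # uncertainty proxy
w = 1.0 / (1.0 + sigma2)                                  # adaptive weights

# student: gold labels at full weight, pseudo-labels down-weighted
student = mlp()
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(200):
    loss = ((student(x_lab) - y_lab) ** 2).mean() + \
           (w * (student(x_unlab) - pseudo_y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```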
-
|
Fast Quantum Property Prediction via Deeper 2D and 3D Graph Networks
(
Poster
)
link »
Molecular property prediction is gaining increasing attention due to its diverse applications. One task of particular interest and importance is predicting quantum chemical properties without 3D equilibrium structures, which is practically favorable since obtaining 3D equilibrium structures requires extremely expensive calculations. In this work, we design a deep graph neural network that predicts quantum properties directly from 2D molecular graphs. In addition, we propose a 3D graph neural network that learns from low-cost conformer sets, which can be obtained with open-source tools on an affordable budget. We evaluate our methods on predicting the HOMO-LUMO energy gap of molecules and demonstrate that they achieve remarkable prediction performance. A minimal message-passing sketch follows below. |
Meng Liu · Cong Fu · Xuan Zhang · Limei Wang · Yaochen Xie · Hao Yuan · Youzhi Luo · Zhao Xu · Shuiwang Ji 🔗 |
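The 2D route can be illustrated with a bare-bones message-passing network: atoms exchange messages along bonds for a few rounds, then a sum-pool produces one scalar per molecule (e.g., a HOMO-LUMO gap). The dense adjacency, random features, and shallow depth are simplifications for the sketch; the paper's networks are far deeper and run on real molecular graphs.

```python
# Scalar property from a 2D molecular graph via message passing + sum pooling.
# Toy sizes and random inputs; illustrative, not the paper's architecture.
import torch
import torch.nn as nn

class MPLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.upd = nn.Linear(2 * dim, dim)
    def forward(self, h, adj):
        m = adj @ self.msg(h)                 # aggregate neighbor messages
        return torch.relu(self.upd(torch.cat([h, m], dim=1)))

class GapNet(nn.Module):
    def __init__(self, dim=64, layers=4):
        super().__init__()
        self.embed = nn.Linear(8, dim)        # 8 toy atom features
        self.mp = nn.ModuleList(MPLayer(dim) for _ in range(layers))
        self.out = nn.Linear(dim, 1)
    def forward(self, x, adj):
        h = self.embed(x)
        for layer in self.mp:
            h = layer(h, adj)
        return self.out(h.sum(dim=0))         # sum-pool atoms -> one scalar

n_atoms = 12
x = torch.randn(n_atoms, 8)                   # toy atom features
adj = (torch.rand(n_atoms, n_atoms) > 0.7).float()
adj = ((adj + adj.T) > 0).float()             # symmetric toy bond structure
print(GapNet()(x, adj))                       # predicted gap (untrained)
```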
-
|
Flood Segmentation on Sentinel-1 SAR Imagery with Semi-Supervised Learning
(
Poster
)
link »
SlidesLive Video »
Floods wreak havoc throughout the world, causing billions of dollars in damages and uprooting communities, ecosystems, and economies. The NASA Impact Emerging Techniques in Computational Intelligence (ETCI) competition on Flood Detection tasked participants with predicting flooded pixels after training on synthetic aperture radar (SAR) images in a supervised setting. We propose a semi-supervised pseudo-labeling scheme that derives confidence estimates from U-Net ensembles, thereby progressively improving accuracy. Concretely, we use a cyclical approach with multiple stages: (1) train an ensemble of U-Net architectures on the provided high-confidence hand-labeled data together with pseudo labels (low-confidence labels) generated on the entire unlabeled test dataset; (2) filter the generated labels for quality; and (3) combine the retained labels with the previously available high-confidence hand-labeled dataset. The assimilated dataset is used for the next round of ensemble training, and this cycle is repeated until the performance improvement plateaus. Additionally, we post-process our results with conditional random fields. Our approach sets a new state of the art on the Sentinel-1 dataset for the ETCI competition with 0.7654 IoU, an impressive improvement over the 0.60 IoU baseline. Our method, which we release with all code including trained models, can also serve as an open-science benchmark for the released Sentinel-1 dataset. A minimal sketch of the cyclical loop follows below. |
Siddha Ganju · Sayak Paul 🔗 |
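The cyclical loop of the abstract (train ensemble, pseudo-label, filter, grow the labeled pool, repeat) can be sketched compactly. A tiny conv net stands in for the U-Net ensemble, ensemble agreement stands in for the confidence estimate, the CRF post-processing step is omitted, and all thresholds are illustrative assumptions.

```python
# Cyclical pseudo-labeling for segmentation: train ensemble -> pseudo-label ->
# keep confident images -> grow labeled pool -> repeat. Illustrative sketch.
import torch
import torch.nn as nn

def tiny_segmenter():
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 1, 3, padding=1))  # per-pixel flood logit

x_lab = torch.randn(20, 1, 32, 32)
y_lab = (torch.rand(20, 1, 32, 32) > 0.5).float()
x_unlab = torch.randn(80, 1, 32, 32)

for cycle in range(3):                            # repeat until gains plateau
    models = [tiny_segmenter() for _ in range(3)]
    for m in models:                              # 1) train the ensemble
        opt = torch.optim.Adam(m.parameters(), lr=1e-3)
        for _ in range(50):
            loss = nn.functional.binary_cross_entropy_with_logits(m(x_lab), y_lab)
            opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                         # 2) pseudo-label + filter
        probs = torch.stack([torch.sigmoid(m(x_unlab)) for m in models]).mean(0)
        conf = torch.maximum(probs, 1 - probs).mean(dim=(1, 2, 3))
        keep = conf > 0.7                         # keep confidently labeled images
    if keep.any():                                # 3) grow the labeled pool
        x_lab = torch.cat([x_lab, x_unlab[keep]])
        y_lab = torch.cat([y_lab, (probs[keep] > 0.5).float()])
        x_unlab = x_unlab[~keep]
    print(f"cycle {cycle}: labeled pool = {len(x_lab)}")
```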
-
|
Multi-modal Self-supervised Pre-training for Large-scale Genome Data
(
Poster
)
link »
Open genomic regions, being accessible to regulatory proteins, can act as on/off switches or amplifiers/attenuators of gene expression and thus reflect the defining characteristics of cell types. Many previous models predict regulatory regions from sequence alone, but the interaction between regulatory regions and genes can be complex and differ between cell types. Moreover, current models usually perform well only on the cell types in the training set and do not generalize to data-scarce scenarios. In this work, we propose a simple yet effective approach for pre-training on genome data in a multi-modal and self-supervised manner, which we call GeneBERT. Specifically, we simultaneously take the 1D sequence of genome data and a 2D matrix of (transcription factors × regions) as input, and propose three pre-training tasks to improve the robustness and generalizability of our model. We pre-train our model on the ATAC-seq dataset with 17 million gene sequences. We evaluate GeneBERT on various downstream tasks, including promoter prediction, transcription factor binding site prediction, disease risk estimation, and RNA splicing. Extensive experiments demonstrate the effectiveness of multi-modal and self-supervised pre-training for large-scale genome data. A minimal sketch of a multi-modal objective follows below. |
Shentong Mo · Xi Fu · Chenyang Hong · Yizhen Chen · Yuxuan Zheng · Xiangru Tang · Yanyan Lan · Zhiqiang Shen · Eric Xing 🔗 |
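To show the shape of multi-modal self-supervised pre-training in this spirit, here is a minimal sketch with two generic objectives: masked-token recovery on the 1D sequence, and a contrastive loss matching each sequence to its own (transcription factors × regions) matrix. The paper proposes three pre-training tasks; the encoders, losses, and sizes below are illustrative stand-ins, not GeneBERT itself.

```python
# Two-modality self-supervised pre-training sketch: masked-token recovery on
# the sequence + contrastive sequence<->matrix matching. Illustrative only.
import torch
import torch.nn as nn

VOCAB, L, F, R, DIM = 6, 128, 16, 32, 64     # ACGT + mask + pad; toy sizes

seq_enc = nn.Embedding(VOCAB, DIM)
seq_head = nn.Linear(DIM, VOCAB)              # masked-token classifier
mat_enc = nn.Sequential(nn.Flatten(), nn.Linear(F * R, DIM))
params = list(seq_enc.parameters()) + list(seq_head.parameters()) + \
         list(mat_enc.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

seqs = torch.randint(0, 4, (8, L))            # toy genome token batch
mats = torch.randn(8, F, R)                   # paired toy TF x region matrices
MASK = 4

for step in range(100):
    masked = seqs.clone()
    mask = torch.rand(8, L) < 0.15            # BERT-style 15% masking
    masked[mask] = MASK
    h = seq_enc(masked)                       # (8, L, DIM)
    loss_mlm = nn.functional.cross_entropy(seq_head(h[mask]), seqs[mask])
    # contrastive matching: each sequence should pair with its own matrix
    zs = nn.functional.normalize(h.mean(dim=1), dim=1)
    zm = nn.functional.normalize(mat_enc(mats), dim=1)
    logits = zs @ zm.T / 0.1
    loss_match = nn.functional.cross_entropy(logits, torch.arange(8))
    loss = loss_mlm + loss_match
    opt.zero_grad(); loss.backward(); opt.step()
```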
-
|
Multiple Sequential Learning Tasks Represented in Recurrent Neural Networks
(
Poster
)
link »
Our brain can flexibly perform a variety of sequential learning tasks, including music, language, and mathematics, but the underlying mechanism has not been elucidated by traditional experimental and modeling studies, which were designed for only one task at a time. From a computational perspective, we hypothesize that the working mechanism of a multitask model can suggest a possible solution for that of the brain. We therefore trained a single recurrent neural network to perform eight sequential learning tasks that depend on working memory, structure extraction, categorization, and other cognitive processes. After training, the model learns sophisticated information-holding and erasing strategies to perform the tasks simultaneously. More interestingly, the model learns to reuse neurons to encode similar task features. We hope this work provides a computational platform for investigating the neural representations underlying cognitive sequential learning. A minimal multitask-RNN sketch follows below. |
Shaonan Wang 🔗 |
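The basic setup of one recurrent network serving several sequential tasks can be sketched as follows: a task-ID vector is appended to every input step, so a single shared GRU must learn all tasks at once. The two toy tasks here (copy the last input vs. sum the inputs), the task encoding, and all sizes are illustrative assumptions, not the paper's eight cognitive tasks.

```python
# One shared recurrent network, multiple sequential tasks distinguished by a
# task-ID input appended at every step. Illustrative sketch only.
import torch
import torch.nn as nn

N_TASKS, T, HID = 2, 10, 64
rnn = nn.GRU(1 + N_TASKS, HID, batch_first=True)
readout = nn.Linear(HID, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

def batch(task, n=64):
    x = torch.rand(n, T, 1)
    y = x[:, -1, :] if task == 0 else x.sum(dim=1)   # task-specific target
    onehot = torch.zeros(n, T, N_TASKS)
    onehot[:, :, task] = 1.0                         # task-ID at every step
    return torch.cat([x, onehot], dim=2), y

for step in range(500):
    task = step % N_TASKS                            # interleave tasks
    x, y = batch(task)
    h, _ = rnn(x)
    loss = ((readout(h[:, -1, :]) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

After training such a model, one can probe the hidden states to ask which units are shared across tasks, which is the kind of reuse analysis the abstract describes.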
Author Information
Payal Chandak (Harvard-MIT Health Sciences and Technology)
Yuanqi Du (George Mason University)
Tianfan Fu (Georgia Institute of Technology)
Wenhao Gao (Massachusetts Institute of Technology)
Kexin Huang (Stanford University)
Shengchao Liu (MILA-UdeM)
Ziming Liu (MIT)
Gabriel Spadon (University of São Paulo)
Max Tegmark (MIT)
Max Tegmark is a professor doing physics and AI research at MIT, and advocates for positive use of technology as president of the Future of Life Institute. He is the author of over 250 publications as well as the New York Times bestsellers “Life 3.0: Being Human in the Age of Artificial Intelligence” and "Our Mathematical Universe: My Quest for the Ultimate Nature of Reality". His AI research focuses on intelligible intelligence. His work with the Sloan Digital Sky Survey on galaxy clustering shared the first prize in Science magazine’s “Breakthrough of the Year: 2003.”
Hanchen Wang (University of Cambridge)
Adrian Weller (Cambridge, Alan Turing Institute)
Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, where he is also a Turing Fellow leading work on safe and ethical AI. He is a Principal Research Fellow in Machine Learning at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. He serves on several boards including the Centre for Data Ethics and Innovation. Previously, Adrian held senior roles in finance.
Max Welling (University of Amsterdam / Qualcomm AI Research)
Marinka Zitnik (Harvard University)
More from the Same Authors
-
2021 : Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development »
Kexin Huang · Tianfan Fu · Wenhao Gao · Yue Zhao · Yusuf Roohani · Jure Leskovec · Connor Coley · Cao Xiao · Jimeng Sun · Marinka Zitnik -
2021 Spotlight: Iterative Teaching by Label Synthesis »
Weiyang Liu · Zhen Liu · Hanchen Wang · Liam Paull · Bernhard Schölkopf · Adrian Weller -
2021 : GraphGT: Machine Learning Datasets for Graph Generation and Transformation »
Yuanqi Du · Shiyu Wang · Xiaojie Guo · Hengning Cao · Shujie Hu · Junji Jiang · Aishwarya Varala · Abhinav Angirekula · Liang Zhao -
2021 : Learning Disentangled Representation for Spatiotemporal Graph Generation »
Yuanqi Du · Xiaojie Guo · Hengning Cao · Yanfang (Fanny) Ye · Liang Zhao -
2021 : Physics-Augmented Learning: A New Paradigm Beyond Physics-Informed Learning »
Ziming Liu · Yuanqi Du · Yunyue Chen · Max Tegmark -
2021 : Multi-task Learning with Domain Knowledge for Molecular Property Prediction »
Shengchao Liu · Meng Qu · Zuobai Zhang · Jian Tang -
2021 : Adaptive Pseudo-labeling for Quantum Calculations »
Kexin Huang · Vishnu Sresht · Brajesh Rai -
2021 : Particle Dynamics for Learning EBMs »
Kirill Neklyudov · Priyank Jaini · Max Welling -
2022 Poster: Scalable Infomin Learning »
Yanzhi Chen · Weihao Sun · Yingzhen Li · Adrian Weller -
2022 : GAUCHE: A Library for Gaussian Processes in Chemistry »
Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Bojana Rankovic · Yuanqi Du · Arian Jamasb · Julius Schwartz · Austin Tripp · Gregory Kell · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Alpha Lee · Philippe Schwaller · Jian Tang -
2022 : PIPS: Path Integral Stochastic Optimal Control for Path Sampling in Molecular Dynamics »
Lars Holdijk · Yuanqi Du · Ferry Hooft · Priyank Jaini · Berend Ensing · Max Welling -
2022 : Structure-Inducing Pre-training »
Matthew McDermott · Brendan Yap · Peter Szolovits · Marinka Zitnik -
2022 : Tabular deep learning when $d \gg n$ by using an auxiliary knowledge graph »
Camilo Ruiz · Hongyu Ren · Kexin Huang · Jure Leskovec -
2022 : MoleculeCLIP: Learning Transferable Molecule Multi-Modality Models via Natural Language »
Shengchao Liu · Weili Nie · Chengpeng Wang · Jiarui Lu · Zhuoran Qiao · Ling Liu · Jian Tang · Anima Anandkumar · Chaowei Xiao -
2022 : Program Synthesis for Integer Sequence Generation »
Natasha Butt · Auke Wiggers · Taco Cohen · Max Welling -
2022 : ChemSpacE: Interpretable and Interactive Chemical Space Exploration »
Yuanqi Du · Xian Liu · Nilay Shah · Shengchao Liu · Jieyu Zhang · Bolei Zhou -
2022 : Structure-based Drug Design with Equivariant Diffusion Models »
Arne Schneuing · Yuanqi Du · Charles Harris · Arian Jamasb · Ilia Igashov · Weitao Du · Tom Blundell · Pietro Lió · Carla Gomes · Max Welling · Michael Bronstein · Bruno Correia -
2022 : Improving Molecular Pretraining with Complementary Featurizations »
Yanqiao Zhu · Dingshuo Chen · Yuanqi Du · Yingze Wang · Qiang Liu · Shu Wu -
2022 : Relational Out-of-Distribution Generalization »
Xinyu Yang · Xinyi Pan · Shengchao Liu · Huaxiu Yao -
2022 : GraphCG: Unsupervised Discovery of Steerable Factors in Graphs »
Shengchao Liu · Chengpeng Wang · Weili Nie · Hanchen Wang · Jiarui Lu · Bolei Zhou · Jian Tang -
2022 : Conformal Prediction for Resource Prioritisation in Predicting Rare and Dangerous Outcomes »
Varun Babbar · Umang Bhatt · Miri Zilka · Adrian Weller -
2023 Poster: Quasi-Monte Carlo Graph Random Features »
Isaac Reid · Adrian Weller · Krzysztof M Choromanski -
2023 Poster: Uncertainty Quantification over Graph with Conformalized Graph Neural Networks »
Kexin Huang · Ying Jin · Emmanuel Candes · Jure Leskovec -
2023 Poster: Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency »
Owen Queen · Thomas Hartvigsen · Teddy Koker · Huan He · Theodoros Tsiligkaridis · Marinka Zitnik -
2023 Poster: Restart Sampling for Improving Generative Processes »
Yilun Xu · Mingyang Deng · Xiang Cheng · Yonglong Tian · Ziming Liu · Tommi Jaakkola -
2023 Poster: Use perturbations when learning from explanations »
Juyeon Heo · Vihari Piratla · Matthew Wicker · Adrian Weller -
2023 Poster: Full-Atom Protein Pocket Design via Iterative Refinement »
Zaixi Zhang · Zepu Lu · Hao Zhongkai · Marinka Zitnik · Qi Liu -
2023 Poster: The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks »
Ziqian Zhong · Ziming Liu · Max Tegmark · Jacob Andreas -
2023 Poster: Dense-Exponential Random Features: Sharp Positive Estimators of the Gaussian Kernel »
Valerii Likhosherstov · Krzysztof M Choromanski · Kumar Avinava Dubey · Frederick Liu · Tamas Sarlos · Adrian Weller -
2023 Poster: Diffused Redundancy in Pre-trained Representations »
Vedant Nanda · Till Speicher · John Dickerson · Krishna Gummadi · Soheil Feizi · Adrian Weller -
2023 Poster: Controlling Text-to-Image Diffusion by Orthogonal Finetuning »
Zeju Qiu · Weiyang Liu · Haiwen Feng · Yuxuan Xue · Yao Feng · Zhen Liu · Dan Zhang · Adrian Weller · Bernhard Schölkopf -
2023 Poster: Enabling tabular deep learning when $d \gg n$ with an auxiliary knowledge graph »
Camilo Ruiz · Hongyu Ren · Kexin Huang · Jure Leskovec -
2023 Poster: Certification of Distributional Individual Fairness »
Matthew Wicker · Vihari Piratla · Adrian Weller -
2023 Poster: Learning to Receive Help: Intervention-Aware Concept Embedding Models »
Mateo Espinosa Zarlenga · Katie Collins · Krishnamurthy Dvijotham · Adrian Weller · Zohreh Shams · Mateja Jamnik -
2023 Poster: The Quantization Model of Neural Scaling »
Eric Michaud · Ziming Liu · Uzay Girit · Max Tegmark -
2023 Oral: The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks »
Ziqian Zhong · Ziming Liu · Max Tegmark · Jacob Andreas -
2023 Workshop: AI for Science: from Theory to Practice »
Yuanqi Du · Max Welling · Yoshua Bengio · Marinka Zitnik · Carla Gomes · Jure Leskovec · Maria Brbic · Wenhao Gao · Kexin Huang · Ziming Liu · Rocío Mercado · Miles Cranmer · Shengchao Liu · Lijing Wang -
2022 : Keynote »
Marinka Zitnik -
2022 Spotlight: Poisson Flow Generative Models »
Yilun Xu · Ziming Liu · Max Tegmark · Tommi Jaakkola -
2022 Spotlight: Lightning Talks 6B-1 »
Yushun Zhang · Duc Nguyen · Jiancong Xiao · Wei Jiang · Yaohua Wang · Yilun Xu · Zhen Li · Anderson Ye Zhang · Ziming Liu · Fangyi Zhang · Gilles Stoltz · Congliang Chen · Gang Li · Yanbo Fan · Ruoyu Sun · Naichen Shi · Yibo Wang · Ming Lin · Max Tegmark · Lijun Zhang · Jue Wang · Ruoyu Sun · Tommi Jaakkola · Senzhang Wang · Zhi-Quan Luo · Xiuyu Sun · Zhi-Quan Luo · Tianbao Yang · Rong Jin -
2022 Panel: Panel 1C-3: Towards Understanding Grokking:… & Approximation with CNNs… »
Ziming Liu · GUOHAO SHEN -
2022 : Invited Talk #4, The Fifth Paradigm of Scientific Discovery, Max Welling »
Max Welling -
2022 Workshop: New Frontiers in Graph Learning »
Jiaxuan You · Marinka Zitnik · Rex Ying · Yizhou Sun · Hanjun Dai · Stefanie Jegelka -
2022 Workshop: AI for Science: Progress and Promises »
Yi Ding · Yuanqi Du · Tianfan Fu · Hanchen Wang · Anima Anandkumar · Yoshua Bengio · Anthony Gitter · Carla Gomes · Aviv Regev · Max Welling · Marinka Zitnik -
2022 Poster: Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off »
Mateo Espinosa Zarlenga · Pietro Barbiero · Gabriele Ciravegna · Giuseppe Marra · Francesco Giannini · Michelangelo Diligenti · Zohreh Shams · Frederic Precioso · Stefano Melacci · Adrian Weller · Pietro Lió · Mateja Jamnik -
2022 Poster: Chefs' Random Tables: Non-Trigonometric Random Features »
Valerii Likhosherstov · Krzysztof M Choromanski · Kumar Avinava Dubey · Frederick Liu · Tamas Sarlos · Adrian Weller -
2022 Poster: Reinforced Genetic Algorithm for Structure-based Drug Design »
Tianfan Fu · Wenhao Gao · Connor Coley · Jimeng Sun -
2022 Poster: OpenXAI: Towards a Transparent Evaluation of Model Explanations »
Chirag Agarwal · Satyapriya Krishna · Eshika Saxena · Martin Pawelczyk · Nari Johnson · Isha Puri · Marinka Zitnik · Himabindu Lakkaraju -
2022 Poster: A Survey and Datasheet Repository of Publicly Available US Criminal Justice Datasets »
Miri Zilka · Bradley Butcher · Adrian Weller -
2022 Poster: Towards Understanding Grokking: An Effective Theory of Representation Learning »
Ziming Liu · Ouail Kitouni · Niklas S Nolte · Eric Michaud · Max Tegmark · Mike Williams -
2022 Poster: Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization »
Wenhao Gao · Tianfan Fu · Jimeng Sun · Connor Coley -
2022 Poster: Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Biomolecular Structures and Interaction Networks »
Arian Jamasb · Ramon Viñas Torné · Eric Ma · Yuanqi Du · Charles Harris · Kexin Huang · Dominic Hall · Pietro Lió · Tom Blundell -
2022 Poster: Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency »
Xiang Zhang · Ziyuan Zhao · Theodoros Tsiligkaridis · Marinka Zitnik -
2022 Poster: Poisson Flow Generative Models »
Yilun Xu · Ziming Liu · Max Tegmark · Tommi Jaakkola -
2021 : General Discussion 1 - What is out of distribution (OOD) generalization and why is it important? with Yoshua Bengio, Leyla Isik, Max Welling »
Yoshua Bengio · Leyla Isik · Max Welling · Joshua T Vogelstein · Weiwei Yang -
2021 Workshop: Privacy in Machine Learning (PriML) 2021 »
Yu-Xiang Wang · Borja Balle · Giovanni Cherubin · Kamalika Chaudhuri · Antti Honkela · Jonathan Lebensold · Casey Meehan · Mi Jung Park · Adrian Weller · Yuqing Zhu -
2021 : Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders »
T. Anderson Keller · Qinghe Gao · Max Welling -
2021 Workshop: Human Centered AI »
Michael Muller · Plamen P Angelov · Shion Guha · Marina Kogan · Gina Neff · Nuria Oliver · Manuel Rodriguez · Adrian Weller -
2021 Poster: Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions »
Emiel Hoogeboom · Didrik Nielsen · Priyank Jaini · Patrick Forré · Max Welling -
2021 Poster: Topographic VAEs learn Equivariant Capsules »
T. Anderson Keller · Max Welling -
2021 Poster: Learning Equivariant Energy Based Models with Equivariant Stein Variational Gradient Descent »
Priyank Jaini · Lars Holdijk · Max Welling -
2021 Poster: E(n) Equivariant Normalizing Flows »
Victor Garcia Satorras · Emiel Hoogeboom · Fabian Fuchs · Ingmar Posner · Max Welling -
2021 Poster: Modality-Agnostic Topology Aware Localization »
Farhad Ghazvinian Zanjani · Ilia Karmanov · Hanno Ackermann · Daniel Dijkman · Simone Merlin · Max Welling · Fatih Porikli -
2021 Poster: Iterative Teaching by Label Synthesis »
Weiyang Liu · Zhen Liu · Hanchen Wang · Liam Paull · Bernhard Schölkopf · Adrian Weller -
2021 Oral: E(n) Equivariant Normalizing Flows »
Victor Garcia Satorras · Emiel Hoogeboom · Fabian Fuchs · Ingmar Posner · Max Welling -
2020 Workshop: Privacy Preserving Machine Learning - PriML and PPML Joint Edition »
Borja Balle · James Bell · Aurélien Bellet · Kamalika Chaudhuri · Adria Gascon · Antti Honkela · Antti Koskela · Casey Meehan · Olga Ohrimenko · Mi Jung Park · Mariana Raykova · Mary Anne Smart · Yu-Xiang Wang · Adrian Weller -
2020 Poster: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: Graph Meta Learning via Local Subgraphs »
Kexin Huang · Marinka Zitnik -
2020 Spotlight: Open Graph Benchmark: Datasets for Machine Learning on Graphs »
Weihua Hu · Matthias Fey · Marinka Zitnik · Yuxiao Dong · Hongyu Ren · Bowen Liu · Michele Catasta · Jure Leskovec -
2020 Poster: GNNGuard: Defending Graph Neural Networks against Adversarial Attacks »
Xiang Zhang · Marinka Zitnik -
2020 Poster: Subgraph Neural Networks »
Emily Alsentzer · Samuel Finlayson · Michelle Li · Marinka Zitnik -
2020 Poster: AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity »
Silviu-Marian Udrescu · Andrew Tan · Jiahai Feng · Orisvaldo Neto · Tailin Wu · Max Tegmark -
2020 Oral: AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity »
Silviu-Marian Udrescu · Andrew Tan · Jiahai Feng · Orisvaldo Neto · Tailin Wu · Max Tegmark -
2020 Poster: Ode to an ODE »
Krzysztof Choromanski · Jared Quincy Davis · Valerii Likhosherstov · Xingyou Song · Jean-Jacques Slotine · Jacob Varley · Honglak Lee · Adrian Weller · Vikas Sindhwani -
2020 Demonstration: MolDesigner: Interactive Design of Efficacious Drugs with Deep Learning »
Kexin Huang · Tianfan Fu · Dawood Khan · Ali Abid · Ali Abdalla · Abubaker Abid · Lucas Glass · Marinka Zitnik · Cao Xiao · Jimeng Sun -
2019 Workshop: Privacy in Machine Learning (PriML) »
Borja Balle · Kamalika Chaudhuri · Antti Honkela · Antti Koskela · Casey Meehan · Mi Jung Park · Mary Anne Smart · Adrian Weller -
2019 : Poster session »
Jindong Gu · Alice Xiang · Atoosa Kasirzadeh · Zhiwei Han · Omar U. Florez · Frederik Harder · An-phi Nguyen · Amir Hossein Akhavan Rahnama · Michele Donini · Dylan Slack · Junaid Ali · Paramita Koley · Michiel Bakker · Anna Hilgard · Hailey James · Gonzalo Ramos · Jialin Lu · Jingying Yang · Margarita Boyarskaya · Martin Pawelczyk · Kacper Sokol · Mimansa Jaiswal · Umang Bhatt · David Alvarez-Melis · Aditya Grover · Charles Marx · Mengjiao (Sherry) Yang · Jingyan Wang · Gökhan Çapan · Hanchen Wang · Steffen Grünewälder · Moein Khajehnejad · Gourab Patro · Russell Kunes · Samuel Deng · Yuanting Liu · Luca Oneto · Mengze Li · Thomas Weber · Stefan Matthes · Duy Patrick Tu -
2019 : Poster Session »
Jonathan Scarlett · Piotr Indyk · Ali Vakilian · Adrian Weller · Partha P Mitra · Benjamin Aubin · Bruno Loureiro · Florent Krzakala · Lenka Zdeborová · Kristina Monakhova · Joshua Yurtsever · Laura Waller · Hendrik Sommerhoff · Michael Moeller · Rushil Anirudh · Shuang Qiu · Xiaohan Wei · Zhuoran Yang · Jayaraman Thiagarajan · Salman Asif · Michael Gillhofer · Johannes Brandstetter · Sepp Hochreiter · Felix Petersen · Dhruv Patel · Assad Oberai · Akshay Kamath · Sushrut Karmalkar · Eric Price · Ali Ahmed · Zahra Kadkhodaie · Sreyas Mohan · Eero Simoncelli · Carlos Fernandez-Granda · Oscar Leong · Wesam Sakla · Rebecca Willett · Stephan Hoyer · Jascha Sohl-Dickstein · Sam Greydanus · Gauri Jagatap · Chinmay Hegde · Michael Kellman · Jonathan Tamir · Nouamane Laanait · Ousmane Dia · Mirco Ravanelli · Jonathan Binas · Negar Rostamzadeh · Shirin Jalali · Tiantian Fang · Alex Schwing · Sébastien Lachapelle · Philippe Brouillard · Tristan Deleu · Simon Lacoste-Julien · Stella Yu · Arya Mazumdar · Ankit Singh Rawat · Yue Zhao · Jianshu Chen · Xiaoyang Li · Hubert Ramsauer · Gabrio Rizzuti · Nikolaos Mitsakos · Dingzhou Cao · Thomas Strohmer · Yang Li · Pei Peng · Gregory Ongie -
2019 : Keynote - ML »
Max Welling -
2019 Workshop: Workshop on Human-Centric Machine Learning »
Plamen P Angelov · Nuria Oliver · Adrian Weller · Manuel Rodriguez · Isabel Valera · Silvia Chiappa · Hoda Heidari · Niki Kilbertus -
2019 Poster: N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules »
Shengchao Liu · Mehmet Demirel · Yingyu Liang -
2019 Poster: Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models »
Yunfei Teng · Wenbo Gao · François Chalus · Anna Choromanska · Donald Goldfarb · Adrian Weller -
2018 Workshop: Privacy Preserving Machine Learning »
Adria Gascon · Aurélien Bellet · Niki Kilbertus · Olga Ohrimenko · Mariana Raykova · Adrian Weller -
2018 Poster: Geometrically Coupled Monte Carlo Sampling »
Mark Rowland · Krzysztof Choromanski · François Chalus · Aldo Pacchiano · Tamas Sarlos · Richard Turner · Adrian Weller -
2018 Spotlight: Geometrically Coupled Monte Carlo Sampling »
Mark Rowland · Krzysztof Choromanski · François Chalus · Aldo Pacchiano · Tamas Sarlos · Richard Turner · Adrian Weller -
2017 : Invited talk: Challenges for Transparency »
Adrian Weller -
2017 : Closing remarks »
Adrian Weller -
2017 Symposium: Kinds of intelligence: types, tests and meeting the needs of society »
José Hernández-Orallo · Zoubin Ghahramani · Tomaso Poggio · Adrian Weller · Matthew Crosby -
2017 Poster: From Parity to Preference-based Notions of Fairness in Classification »
Muhammad Bilal Zafar · Isabel Valera · Manuel Rodriguez · Krishna Gummadi · Adrian Weller -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris Mooij · David Sontag · Richard Zemel · Max Welling -
2017 Poster: The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings »
Krzysztof Choromanski · Mark Rowland · Adrian Weller -
2017 Poster: Uprooting and Rerooting Higher-Order Graphical Models »
Mark Rowland · Adrian Weller -
2017 Poster: Bayesian Compression for Deep Learning »
Christos Louizos · Karen Ullrich · Max Welling -
2016 Workshop: Bayesian Deep Learning »
Yarin Gal · Christos Louizos · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2016 Workshop: Reliable Machine Learning in the Wild »
Dylan Hadfield-Menell · Adrian Weller · David Duvenaud · Jacob Steinhardt · Percy Liang -
2016 Symposium: Machine Learning and the Law »
Adrian Weller · Thomas D. Grant · Conrad McDonnell · Jatinder Singh -
2015 Symposium: Algorithms Among Us: the Societal Impacts of Machine Learning »
Michael A Osborne · Adrian Weller · Murray Shanahan -
2015 Poster: Bayesian dark knowledge »
Anoop Korattikara Balan · Vivek Rathod · Kevin Murphy · Max Welling -
2014 Poster: Clamping Variables and Approximate Inference »
Adrian Weller · Tony Jebara -
2014 Oral: Clamping Variables and Approximate Inference »
Adrian Weller · Tony Jebara