The "Machine Learning and the Physical Sciences" workshop aims to provide a cutting-edge venue for research at the interface of machine learning (ML) and the physical sciences. This interface spans (1) applications of ML in physical sciences (“ML for physics”) and (2) developments in ML motivated by physical insights (“physics for ML”).
ML methods have had great success in learning complex representations of data that enable novel modeling and data processing approaches in many scientific disciplines. Physical sciences span problems and challenges at all scales in the universe: from finding exoplanets in trillions of sky pixels, to finding ML-inspired solutions to the quantum many-body problem, to detecting anomalies in event streams from the Large Hadron Collider, to predicting how extreme weather events will vary with climate change. Tackling a number of associated data-intensive tasks, including, but not limited to, segmentation, 3D computer vision, sequence modeling, causal reasoning, generative modeling, and efficient probabilistic inference, is critical for furthering scientific discovery. In addition to using ML models for scientific discovery, tools and insights from the physical sciences are increasingly brought to the study of ML models.
By bringing together ML researchers and physical scientists who apply and study ML, we expect to strengthen the interdisciplinary dialogue, introduce exciting new open problems to the broader community, and stimulate the production of new approaches to solving challenging open problems in the sciences. Invited talks from leading individuals in both communities will cover state-of-the-art techniques and set the stage for this workshop, which will also include contributed talks selected from submissions. The workshop will also feature an expert panel discussion on “Physics for ML” and a breakout session dedicated to community building, which will serve to foster dialogue between the physical science and ML research communities.
Mon 6:00 a.m. - 6:10 a.m.

Session 1 - Opening remarks
(Live intro)

Mon 6:10 a.m. - 6:35 a.m.

Session 1 - Invited talk: Max Welling, "Accelerating simulations of nature, both classical and quantum, with equivariant deep learning"
(Invited talk (live))
Max Welling · Atilim Gunes Baydin

Mon 6:35 a.m. - 6:45 a.m.

Session 1 - Invited talk Q&A: Max Welling
(Live Q&A)

Mon 6:45 a.m. - 7:10 a.m.

Session 1 - Invited talk: Bingqing Cheng, "Predicting material properties with the help of machine learning"
(Invited talk (live))
Bingqing Cheng · Atilim Gunes Baydin

Mon 7:10 a.m. - 7:20 a.m.

Session 1 - Invited talk Q&A: Bingqing Cheng
(Live Q&A)

Mon 7:20 a.m. - 7:35 a.m.

Session 1 - Contributed talk: Tian Xie, "Crystal Diffusion Variational Autoencoder for Periodic Material Generation"
(Contributed talk (live))
Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) different atom types have complex, yet specific bonding preferences. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We generate significantly more realistic materials than past methods in two tasks: 1) reconstructing the input structure, and 2) generating valid, diverse, and realistic materials. Our contribution also includes the creation of several standard datasets and evaluation metrics for the broader machine learning community.
Tian Xie · Atilim Gunes Baydin

Mon 7:35 a.m. - 9:05 a.m.

Session 1 - Poster session
(Poster session (Gather.town))

Mon 9:05 a.m. - 9:10 a.m.

Session 2 - Opening remarks
(Live intro)

Mon 9:10 a.m. - 10:10 a.m.

Session 2 - Panel discussion: Jennifer Chayes, Marylou Gabrié, Michela Paganini, Sara Solla; Moderator: Lenka Zdeborová
(Live panel discussion)

Mon 10:10 a.m. - 10:35 a.m.

Session 2 - Invited talk: Megan Ansdell, "NASA's efforts & opportunities to support ML in the Physical Sciences"
(Invited talk (live))
Megan Ansdell · Atilim Gunes Baydin

Mon 10:35 a.m. - 10:45 a.m.

Session 2 - Invited talk Q&A: Megan Ansdell
(Live Q&A)

Mon 10:45 a.m. - 11:00 a.m.

Session 2 - Contributed talk: George Stein, "Self-supervised similarity search for large scientific datasets"
(Contributed talk (live))
We present the use of self-supervised learning to explore and exploit large unlabeled datasets. Focusing on 42 million galaxy images from the latest data release of the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys, we first train a self-supervised model to distil low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image. We then use the representations to construct and publicly release an interactive semantic similarity search tool. We demonstrate how our tool can be used to rapidly discover rare objects given only a single example, increase the speed of crowdsourcing campaigns, flag bad data, and construct and improve training sets for supervised applications. While we focus on images from sky surveys, the technique is straightforward to apply to any scientific dataset of any dimensionality. The similarity search web app can be found at: https://github.com/georgestein/galaxy_search
George Stein · Atilim Gunes Baydin

Mon 11:00 a.m. - 12:30 p.m.

Session 2 - Poster session
(Poster session (Gather.town))

Mon 12:30 p.m. - 12:35 p.m.

Session 3 - Opening remarks
(Live intro)

Mon 12:35 p.m. - 1:00 p.m.

Session 3 - Invited talk: Surya Ganguli, "From the geometry of high dimensional energy landscapes to optimal annealing in a dissipative many body quantum optimizer"
(Invited talk (live))
Surya Ganguli · Atilim Gunes Baydin

Mon 1:00 p.m. - 1:10 p.m.

Session 3 - Invited talk Q&A: Surya Ganguli
(Live Q&A)

Mon 1:10 p.m. - 1:35 p.m.

Session 3 - Invited talk: Laure Zanna, "The future of climate modeling in the age of machine learning"
(Invited talk (live))
Laure Zanna · Atilim Gunes Baydin

Mon 1:35 p.m. - 1:45 p.m.

Session 3 - Invited talk Q&A: Laure Zanna
(Live Q&A)

Mon 1:45 p.m. - 2:00 p.m.

Session 3 - Contributed talk: Maximilian Dax, "Amortized Bayesian inference of gravitational waves with normalizing flows"
(Contributed talk (live))
Gravitational waves (GWs) detected by the LIGO and Virgo observatories encode descriptions of their astrophysical progenitors. To characterize these systems, physical GW signal models are inverted using Bayesian inference coupled with stochastic samplers, a task that can take O(day) for a typical binary black hole. Several recent efforts have attempted to speed this up by using normalizing flows to estimate the posterior distribution conditioned on the observed data. In this study, we further develop these techniques to achieve results nearly indistinguishable from standard samplers when evaluated on real GW data, with inference times of one minute per event. This is enabled by (i) incorporating detector nonstationarity from event to event by conditioning on a summary of the noise characteristics, (ii) using an embedding network adapted to GW signals to compress data, and (iii) adopting a new inference algorithm that makes use of underlying physical equivariances.
Maximilian Dax · Atilim Gunes Baydin

Mon 2:00 p.m. - 3:00 p.m.

Session 3 - Community development breakouts
(Community breakout session (Gather.town))

Mon 3:00 p.m. - 3:30 p.m.

Session 3 - Feedback from community development breakouts and closing remarks
(Live feedback)



Flood Segmentation on Sentinel-1 SAR Imagery with Semi-Supervised Learning
(Poster)
Floods wreak havoc throughout the world, causing billions of dollars in damages and uprooting communities, ecosystems, and economies. The NASA Impact Emerging Techniques in Computational Intelligence (ETCI) competition on Flood Detection tasked participants with predicting flooded pixels after training with synthetic aperture radar (SAR) images in a supervised setting. We propose a semi-supervised learning pseudo-labeling scheme that derives confidence estimates from U-Net ensembles, thereby progressively improving accuracy. Concretely, we use a cyclical approach involving multiple stages: (1) training an ensemble model of multiple U-Net architectures with the provided high-confidence hand-labeled data and generating pseudo labels, or low-confidence labels, on the entire unlabeled test dataset; (2) filtering the generated labels for quality; and (3) combining the retained generated labels with the previously available high-confidence hand-labeled dataset. This assimilated dataset is used for the next round of training ensemble models. This cyclical process is repeated until the performance improvement plateaus. Additionally, we post-process our results with Conditional Random Fields. Our approach sets a new state-of-the-art on the Sentinel-1 dataset for the ETCI competition with 0.7654 IoU, compared with the 0.60 IoU baseline. Our method, which we release with all the code including trained models, can also be used as an open science benchmark for the Sentinel-1 released dataset.
Siddha Ganju · Sayak Paul
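
The IoU (intersection over union) figures quoted in this abstract compare a predicted flood mask with a reference mask. As a minimal sketch, assuming binary masks stored as nested lists of 0/1 values (the function name `iou` and the convention for two empty masks are illustrative choices, not taken from the competition code), the metric could be computed as follows:

```python
def iou(pred, target):
    """Intersection over union for two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
# Two pixels are flooded in both masks; four are flooded in at least one.
print(iou(pred, target))  # 0.5
```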


A General Method for Calibrating Stochastic Radio Channel Models with Kernels
(Poster)
Characterization of the environment in which communication is taking place, termed the radio channel, is imperative for the design and analysis of communication systems. Stochastic models of the radio channel are widely used simulation tools that construct a probabilistic model of the radio channel. Calibrating these models to new measurement data is challenging when the likelihood function is intractable. The standard approach to this problem involves sophisticated algorithms for extraction and clustering of multipath components, following which point estimates of the model parameters can be obtained using specialized estimators. We propose instead an approximate Bayesian computation algorithm based on the maximum mean discrepancy with a kernel carefully crafted for this task. The proposed method is able to estimate the parameters of the model accurately in simulations, and has the advantage that it can be used on a wide range of models.
Ayush Bharti · Francois-Xavier Briol
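
To illustrate the general idea of approximate Bayesian computation (ABC) with the maximum mean discrepancy (MMD) as the distance, here is a minimal sketch. Everything specific in it is an assumption made for illustration: a plain Gaussian kernel stands in for the task-specific kernel mentioned in the abstract, a toy Gaussian simulator stands in for a radio channel model, and simple rejection sampling stands in for whatever ABC variant the authors use.

```python
import math
import random


def gaussian_kernel(x, y, lengthscale=1.0):
    # Stand-in kernel; the abstract refers to a kernel crafted for channel data.
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def mmd_squared(xs, ys, lengthscale=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, lengthscale) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def abc_rejection(observed, simulate, prior_sample, n_draws, threshold, rng):
    """Keep prior draws whose simulated data fall within `threshold` of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        if mmd_squared(observed, simulate(theta, rng)) < threshold:
            accepted.append(theta)
    return accepted


def simulate(theta, rng, n=40):
    # Hypothetical toy simulator: Gaussian data with unknown mean theta.
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def prior_sample(rng):
    return rng.uniform(-5.0, 5.0)


rng = random.Random(0)
observed = simulate(2.0, rng)
posterior = abc_rejection(observed, simulate, prior_sample, 200, 0.05, rng)
print(len(posterior), "accepted draws")
```

The accepted draws approximate the posterior over `theta`. Because the comparison only requires simulated samples, the same loop applies to any simulator whose likelihood is intractable.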


A Multi-Survey Dataset and Benchmark for First Break Picking in Hard Rock Seismic Exploration
(Poster)
Seismic surveys are a valuable source of information for mineral exploration activities. We introduce a reflection seismic survey dataset acquired at four distinct hard rock mining sites to stimulate the development of new seismic data interpretation approaches. In particular, we provide annotations as well as a sound benchmarking methodology to evaluate the transferability of supervised first break picking solutions on our dataset. We train and evaluate a baseline solution based on a U-Net and discuss potential improvements to this approach.
Pierre-Luc St-Charles · Joumana Ghosn


Using Mask R-CNN to detect and mask ghosting and scattered-light artifacts in astronomical images
(Poster)
Wide-field astronomical surveys are often affected by the presence of undesirable reflections (often known as 
Dimitrios Tanoglidis · Aleksandra Ciprijanovic


Neural density estimation and uncertainty quantification for laser induced breakdown spectroscopy spectra
(Poster)
Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.
Katiana Kontolati · Nishant Panda · Diane Oyen


Robustness of deep learning algorithms in astronomy - galaxy morphology studies
(Poster)
Deep learning models are being increasingly adopted in a wide array of scientific domains, especially to handle the high dimensionality and volume of scientific data. However, these models tend to be brittle due to their complexity and overparametrization, especially to the adversarial perturbations that can appear due to common image processing such as compression or blurring that are often seen with real scientific data. It is crucial to understand this brittleness and develop models robust to these adversarial perturbations. To this end, we study the effect of observational noise from the exposure time, as well as the worst-case scenario of a one-pixel attack as a proxy for compression or telescope errors, on the performance of ResNet18 trained to distinguish between galaxies of different morphologies in LSST mock data. We also explore how domain adaptation techniques can help improve model robustness in the case of this type of naturally occurring attack and help scientists build more trustworthy and stable models.
Aleksandra Ciprijanovic · Diana Kafkes · Gabriel Nathan Perdue · Sandeep Madireddy · Stefan Wild · Brian Nord


Uncertainty quantification for ptychography using normalizing flows
(Poster)
Ptychography, as an essential tool for high-resolution and non-destructive material characterization, presents a challenging large-scale nonlinear and nonconvex inverse problem; however, its intrinsic photon statistics create clear opportunities for statistical-based deep learning approaches to tackle these challenges, which have been underexplored. In this work, we explore normalizing flows to obtain a surrogate for the high-dimensional posterior, which also enables the characterization of the uncertainty associated with the reconstruction: an extremely desirable capability when judging the reconstruction quality in the absence of ground truth, spotting spurious artifacts, and guiding future experiments using the returned uncertainty patterns. We demonstrate the performance of the proposed method on a synthetic sample with added noise and in various physical experimental settings.
Agnimitra Dasgupta


Model Inversion for Spatiotemporal Processes using the Fourier Neural Operator
(Poster)
We explore model inversion using the Fourier Neural Operator (FNO) of Li et al. The approach learns an FNO emulator of the partial differential equation forward operator from simulated realisations, and then the latent inputs (physical system parameters) are selected by solving an optimisation problem to match a set of observations. Our results suggest that this underdetermined inverse problem is substantially harder, but by careful regularisation we are able to improve our inference substantially.
Daniel MacKinlay · Daniel Pagendam · Petra Kuhnert


Mixture-of-Experts Ensemble with Hierarchical Deep Metric Learning for Spectroscopic Identification
(Poster)
A mixture-of-experts ensemble of hierarchical deep metric learning models is introduced in order to identify materials from X-ray diffraction spectra. In previous studies, the identification accuracy of the 1D convolutional neural network model deteriorates significantly as the number of classes increases. To overcome this problem, a hierarchical deep metric learning model was developed that can identify approximately 10,000 classes with an average top-1 accuracy of 87%. Furthermore, this new model was employed to create expert models for 73 general chemical elements, which in turn were used to construct a mixture-of-experts ensemble. This ensemble model successfully identified materials from 136,899 classes with a top-1 accuracy of 98%.
Masaki Adachi


Amortized Variational Inference for Type Ia Supernova Light Curves
(Poster)
Markov Chain Monte Carlo (MCMC) methods are widely used for Bayesian inference in astronomy. However, when applied to datasets coming from next-generation telescopes, inference becomes computationally expensive. We propose using amortized variational inference to estimate the posterior of a supernova light curve parametric model. We show that amortization with a recurrent neural network is significantly faster than MCMC while providing competitive estimates of the predictive distribution. To the best of our knowledge, this is the first time this fast amortized framework has been applied to astronomical light curves. This approach will be essential when estimating the posterior of astrophysical parameters for the terabytes of data per night that next-generation telescopes will produce.
Alexis Sánchez · Pablo And Huijse · Francisco Förster · Guillermo Cabrera-Vives


Scalable Bayesian Optimization Accelerates Process Optimization of Penicillin Production
(Poster)
While Bayesian Optimization (BO) has emerged as a sample-efficient optimization method for accelerating drug discovery, it has rarely been applied to the process optimization of pharmaceutical manufacturing, which traditionally has relied on human intuition, along with trial-and-error and slow cycles of learning. The combinatorial and hierarchical complexity of such process control also introduces challenges related to high-dimensional design spaces and requirements of larger-scale observations, in which BO has typically scaled poorly. In this paper, we use penicillin production as a case study to demonstrate the efficacy of BO in accelerating the optimization of typical pharmaceutical manufacturing processes. To overcome the challenges raised by high dimensionality, we apply a trust region BO approach (TuRBO) for global optimization of penicillin yield and empirically show that it outperforms other BO and random baselines. We also extend the study by leveraging BO in the context of multi-objective optimization, allowing us to further evaluate the trade-offs between penicillin yield, production time, and CO2 emission as a byproduct. Through quantifying the performance of BO across high-dimensional and multi-objective optimization on drug production processes, we hope to popularize the application of BO in this field and encourage closer collaboration between machine learning and broader scientific communities.

Qiaohao Liang


Neural Symplectic Integrator with Hamiltonian Inductive Bias for the Gravitational N-body Problem
(Poster)
The gravitational $N$-body problem, which is fundamentally important in astrophysics to predict the motion of $N$ celestial bodies under the mutual gravity of each other, is usually solved numerically because there is no known general analytical solution for $N>2$. Can an $N$-body problem be solved accurately by a neural network (NN)? Can a NN observe long-term conservation of energy and orbital angular momentum? Inspired by the Wisdom & Holman (1991) symplectic map, we present a neural $N$-body integrator for splitting the Hamiltonian into a two-body part, solvable analytically, and an interaction part that we approximate with a NN. Our neural symplectic $N$-body code integrates a general three-body system at $\mathcal{O}(N)$ complexity for $10^{5}$ steps without diverging from the ground truth dynamics obtained from a traditional $N$-body integrator. Moreover, it exhibits good inductive bias by successfully predicting the dynamical evolution of $N$-body systems that are not part of the training set.

Maxwell Cai · Simon Portegies Zwart · Damian Podareanu
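
The long-term energy conservation that this abstract asks of a neural integrator is the hallmark of symplectic schemes in general. The sketch below shows it with a kick-drift-kick leapfrog step, one of the simplest symplectic integrators, applied to a two-body (Kepler) orbit in units where GM = 1. This is a stand-in chosen for brevity: it is neither the Wisdom & Holman map nor the neural integrator described above, and the initial conditions and step size are arbitrary illustrative values.

```python
import math


def acceleration(x, y, gm=1.0):
    """Gravitational acceleration from a point mass fixed at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3


def energy(x, y, vx, vy, gm=1.0):
    """Specific orbital energy: kinetic plus potential."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)


def leapfrog(x, y, vx, vy, dt, n_steps, gm=1.0):
    """Kick-drift-kick leapfrog integration of a Kepler orbit."""
    ax, ay = acceleration(x, y, gm)
    for _ in range(n_steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # full drift
        y += dt * vy
        ax, ay = acceleration(x, y, gm)
        vx += 0.5 * dt * ax          # half kick with the updated force
        vy += 0.5 * dt * ay
    return x, y, vx, vy


# Mildly eccentric bound orbit, integrated for 10,000 steps.
start = (1.0, 0.0, 0.0, 0.8)
end = leapfrog(*start, dt=0.01, n_steps=10_000)
e0, e1 = energy(*start), energy(*end)
print("relative energy error:", abs((e1 - e0) / e0))
```

For a symplectic scheme the energy error stays bounded and oscillates instead of growing steadily, which is the behaviour a learned integrator would need to reproduce.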


Using neural networks to reduce communication in numerical solution of partial differential equations
(Poster)
High-performance computing (HPC) applications are frequently communication-bound and so are unable to take advantage of the full extent of compute resources available on a node. Examples abound in scientific computing, where large-scale partial differential equations (PDEs) are solved on hundreds to thousands of nodes. The vast majority of these problems rely on mesh-based discretization techniques and on the calculation of fluxes across element boundaries. The mathematical expression for those fluxes is based on data that is available in the local memory and on neighboring data transferred from another compute node. That data transfer can account for a significant percentage of the simulation time and energy consumption. We present algorithmic approaches for replacing data transfers with local computations, potentially leading to a reduction in simulation cost and avenues for kernel acceleration that would otherwise not be worthwhile. The communication cost can be reduced by up to 50%, with limited impact on physical simulation accuracy.
Laurent White · Ganesh Dasika


Modeling Advection on Directed Graphs using Matérn Gaussian Processes for Traffic Flow
(Poster)
The transport of traffic flow can be modeled by the advection equation. Finite difference and finite volume methods have been used to numerically solve this hyperbolic equation on a mesh. Advection has also been modeled discretely on directed graphs using the graph advection operator [4, 18]. In this paper, we first show that we can reformulate this graph advection operator as a finite difference scheme. We then propose the Directed Graph Advection Matérn Gaussian Process (DGAMGP) model that incorporates the dynamics of this graph advection operator into the kernel of a trainable Matérn Gaussian Process to effectively model traffic flow and its uncertainty as an advective process on a directed graph.
Nadim Saad · Danielle Maddix · Bernie Wang
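
The link between advection on a directed graph and a finite difference scheme can be seen in a small example. The sketch below assumes one common convention, in which the quantity at each node flows along outgoing edges at a rate given by the edge weight; the exact operator used in [4, 18] may differ in sign or normalisation. On a directed ring with unit weights, a forward Euler step of this graph dynamics coincides with the first-order upwind scheme for u_t + u_x = 0 with unit grid spacing.

```python
def advection_operator(n_nodes, edges):
    """Dense matrix M such that du/dt = M u for flow along directed edges.

    `edges` holds (source, destination, weight) triples. The convention that
    mass leaves the source and enters the destination is an assumption.
    """
    m = [[0.0] * n_nodes for _ in range(n_nodes)]
    for src, dst, weight in edges:
        m[dst][src] += weight   # inflow to the destination node
        m[src][src] -= weight   # matching outflow from the source node
    return m


def euler_step(u, m, dt):
    """One forward Euler step of du/dt = M u."""
    n = len(u)
    return [u[i] + dt * sum(m[i][j] * u[j] for j in range(n)) for i in range(n)]


def upwind_step(u, dt, dx=1.0, speed=1.0):
    """First-order upwind step for u_t + speed * u_x = 0 on a periodic grid."""
    return [u[i] - speed * dt / dx * (u[i] - u[i - 1]) for i in range(len(u))]


n = 5
ring = [(i, (i + 1) % n, 1.0) for i in range(n)]   # directed cycle 0 -> 1 -> ... -> 0
m = advection_operator(n, ring)
u0 = [1.0, 0.0, 0.0, 2.0, 0.0]

graph_update = euler_step(u0, m, dt=0.5)
grid_update = upwind_step(u0, dt=0.5)
print(graph_update)  # [0.5, 0.5, 0.0, 1.0, 1.0]
print(grid_update)   # [0.5, 0.5, 0.0, 1.0, 1.0]
```

Each column of `m` sums to zero, so the total quantity on the graph is conserved from step to step.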


Unsupervised Spectral Unmixing for Telluric Correction using a Neural Network Autoencoder
(Poster)
The absorption of light by molecules in the atmosphere of Earth is a complication for ground-based observations of astrophysical objects. Comprehensive information on various molecular species is required to correct for this so-called telluric absorption. We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph. We accomplish this by reducing the data into a compressed representation, which allows us to unveil the underlying solar spectrum and simultaneously uncover the different modes of variation in the observed spectra relating to the absorption of H2O and O2 in the atmosphere of Earth. We demonstrate how the extracted components can be used to remove H2O and O2 tellurics in a validation observation with similar accuracy and at less computational expense than a synthetic approach with molecfit.
Rune Kjærsgaard · Line Clemmensen


S3RP: Self-Supervised Super-Resolution and Prediction for Advection-Diffusion Process
(Poster)
We present a super-resolution model for an advection-diffusion process with limited information. While most super-resolution models assume high-resolution (HR) ground-truth data in training, in many cases such an HR dataset is not readily accessible. Here, we show that a Recurrent Convolutional Network trained with physics-based regularizations is able to reconstruct the HR information without having the HR ground-truth data. Moreover, considering the ill-posed nature of a super-resolution problem, we employ the Recurrent Wasserstein Autoencoder to model the uncertainty.
Chulin Wang · Kyongmin Yeo · Andres Codas · Xiao Jin · Bruce Elmegreen · kleinl


Automatically detecting anomalous exoplanet transits
(Poster)
Raw light curve data from exoplanet transits is too complex for traditional outlier detection methods to be applied naively. We propose an architecture that estimates a latent representation of both the main transit and residual deviations with a pair of variational autoencoders. We show, using two fabricated datasets, that our latent representations of anomalous transit residuals are significantly more amenable to outlier detection than raw data or the latent representation of a traditional variational autoencoder. We then apply our method to real exoplanet transit data. Our study is the first that automatically identifies anomalous exoplanet transit light curves. We additionally release three first-of-their-kind datasets to enable further research.
Christoph Hönes · Benjamin K Miller


DeepZipper: A Novel Deep Learning Architecture for Lensed Supernovae Identification
(Poster)
The identification of gravitationally lensed supernovae in modern astronomical datasets is a needle-in-a-haystack problem with dramatic scientific implications: discovered systems can be used to directly measure and resolve the current tension on the value of the expansion rate of the Universe today. We hypothesize that the image-based features of the gravitational lensing and the temporal-based features of the time-varying brightness are equally important in classifications. We therefore develop a deep learning technique that utilizes long short-term memory cells for the time-varying brightness of astronomical systems and convolutional layers for the raw images of astronomical systems simultaneously, and then concatenates the feature maps with multiple fully connected layers. This novel approach achieves a receiver operating characteristic area under curve of 0.97 on simulated astronomical data and, more importantly, outperforms standalone versions of its recurrent and convolutional constituents. We find that combining recurrent and convolutional layers within one coherent network architecture allows the network to optimally weight and aggregate the temporal and image features to yield a promising tool for lensed supernovae identification.
Robert Morgan · Brian Nord


Using physics-informed regularization to improve extrapolation capabilities of neural networks
(Poster)
Neural-network-based surrogate models, which replace (parts of) a physics-based simulator, are attractive for their efficiency, yet they suffer from a lack of extrapolation capability. Focusing on the wave equation, we investigate the use of several physics-based regularization terms in the loss function as a way to increase the extrapolation accuracy, together with assessing the impact of a term that conditions the neural network to weakly satisfy the boundary conditions. These regularization terms do not require any labeled data. By gradually incorporating the regularization terms while training, we achieve a more than 5X reduction in extrapolation error compared to a baseline (i.e., physics-less) neural network that is trained with the same set of labeled data. We map out future research directions, and provide some insights about leveraging the trained neural-network state for devising sampling strategies.
Ganesh Dasika · Laurent White
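
A physics-based regularization term of the kind this abstract describes penalises how strongly a candidate function violates the governing equation at unlabeled points. The sketch below is an assumption-laden illustration for the one-dimensional wave equation u_tt = c^2 u_xx. It uses central finite differences where a neural surrogate would normally use automatic differentiation, and the function name, sample points, and step size `h` are all hypothetical choices.

```python
import math


def wave_residual_penalty(u, points, c, h=1e-3):
    """Mean squared residual of u_tt - c^2 * u_xx over (x, t) sample points.

    `u` is any callable u(x, t); no labeled data are needed, only the points.
    """
    total = 0.0
    for x, t in points:
        u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
        u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
        total += (u_tt - c ** 2 * u_xx) ** 2
    return total / len(points)


c = 2.0
points = [(x, t) for x in (0.5, 1.0, 1.5) for t in (0.1, 0.2, 0.3)]


def travelling_wave(x, t):
    # Exact solution of the wave equation with speed c.
    return math.sin(x - c * t)


def wrong_frequency(x, t):
    # Oscillates twice as fast in time as a solution would, so it violates the PDE.
    return math.sin(x) * math.cos(2.0 * c * t)


good = wave_residual_penalty(travelling_wave, points, c)
bad = wave_residual_penalty(wrong_frequency, points, c)
print(good < bad)  # True
```

During training, a term like this would be added to the data-fitting loss with a weight, which the abstract suggests increasing gradually.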


Single Image Super-Resolution with Uncertainty Estimation for Lunar Satellite Images
(Poster)
Recently, there has been a renewed interest in returning to the Moon, with many planned missions targeting the south pole. This region is of high scientific and commercial interest, mostly due to the presence of water ice and other volatiles which could enable our sustainable presence on the Moon and beyond. In order to plan safe and effective crewed and robotic missions, access to high-resolution (<0.5 m) surface imagery is critical. However, the overwhelming majority (99.7%) of existing images over the south pole have spatial resolutions >1 m. Currently, the only way to obtain better images is to launch a new satellite mission to the Moon with better equipment to gather more precise data. In this work we develop an alternative that can be used directly on previously gathered data and therefore saves a lot of resources. It consists of a single image super-resolution (SR) approach based on generative adversarial networks that is able to super-resolve existing images from 1 m to 0.5 m resolution, unlocking a large catalogue of images (∼50,000) for more accurate mission planning in the region of interest for the upcoming missions. We show that our enhanced images reveal previously unseen hazards such as small craters and boulders, allowing safer traverse planning. Our approach also includes uncertainty estimation, which allows mission planners to understand the reliability of the super-resolved images.
Jose Delgado-Centeno · Paula Harder · Ben Moseley · Valentin Bickel · Siddha Ganju · Miguel Olivares · Alfredo Kalaitzis


Deep learning techniques for a real-time neutrino classifier
(Poster)
The ARIANNA experiment is a detector designed to record radio signals created by high-energy neutrino interactions in the Antarctic ice. Because of the low neutrino rate at high energies, the physics output is limited by statistics. Hence, an increase in detector sensitivity significantly improves the interpretation of data and offers the ability to probe new physics. The trigger thresholds of the detector are limited by the rate of triggering on unavoidable noise. A real-time noise rejection algorithm enables the thresholds to be lowered substantially and increases the sensitivity of the detector by up to a factor of two compared to the current ARIANNA capabilities. Deep learning discriminators based on Fully Connected Neural Networks (FCNN) and Convolutional Neural Networks (CNN) are evaluated for their ability to reject a high percentage of noise events (while retaining most of the neutrino signal) and to classify events quickly. In particular, we describe a CNN trained on Monte Carlo data that runs on the current ARIANNA microcontroller and retains 95% of the neutrino signal at a noise rejection factor of 10^5.
Astrid Anker


Deep learning reconstruction of the neutrino energy with a shallow Askaryan detector
(Poster)
High-energy neutrinos (above a few 10^16 eV) can be detected cost-effectively with a sparse array of radio detector stations installed in polar ice sheets. The technology has been explored successfully in pilot arrays. A large radio detector is currently being constructed in Greenland with the potential to measure the first cosmogenic neutrino, and an order-of-magnitude more sensitive detector is being planned with IceCube-Gen2. We present the first end-to-end reconstruction of the neutrino energy from radio detector data. NuRadioMC was used to create a large data set of 40 million events of expected radio signals that are generated via the Askaryan effect following a neutrino interaction in the ice for a broad range of neutrino energies between 100 PeV and 10 EeV. We simulated the voltage traces that would be measured by the five antennas of a shallow detector station in the presence of noise. We designed and trained a deep neural network to determine the shower energy directly from the simulated experimental data and achieve a resolution better than a factor of two (STD < 0.3 in log10(E)), which is below the irreducible uncertainty from inelasticity fluctuations. We present the model architecture and study the dependence of the resolution on event parameters. This method will enable Askaryan detectors to measure the neutrino energy.
Stephen McAleer · Christian Glaser · Pierre Baldi


Inferring dark matter substructure with global astrometry beyond the power spectrum
(Poster)
Astrometric lensing has recently emerged as a promising avenue for characterizing the population of dark matter clumps (subhalos) in our Galaxy. Leveraging recent advances in simulation-based inference and neural network architectures, we introduce a novel method to look for global dark matter-induced lensing signatures in astrometric datasets. Our method shows significantly greater sensitivity to a cold dark matter population compared to existing approaches, establishing machine learning as a powerful tool for characterizing dark matter using astrometric data.
Siddharth Mishra-Sharma 🔗 


Stochastic Adversarial Koopman Model for Dynamical Systems
(Poster)
Dynamical systems are ubiquitous and are often modeled using a nonlinear system of governing equations. Numerical solution procedures for many dynamical systems have existed for several decades, but can be slow due to the high-dimensional state space of the dynamical system. Thus, deep learning-based reduced order models (ROMs) are of interest, and one such family of algorithms is based on Koopman theory. This paper extends a recently developed adversarial Koopman model (Balakrishnan & Upadhyay, arXiv:2006.05547) to stochastic space, where the Koopman operator acts on the probability distribution of the latent encoding of an encoder. Specifically, the latent encoding of the system is modeled as a Gaussian, and is advanced in time by using an auxiliary neural network that outputs two Koopman matrices $K_{\mu}$ and $K_{\sigma}$. Adversarial and gradient losses are used, and this is found to lower the prediction errors. A reduced Koopman formulation is also undertaken where the Koopman matrices are assumed to have a tridiagonal structure, and this yields predictions comparable to the baseline model with full Koopman matrices. The efficacy of the stochastic Koopman model is demonstrated on different test problems in chaos, fluid dynamics, combustion, and reaction-diffusion models. The proposed model is also applied in a setting where the Koopman matrices are conditioned on other input parameters for generalization, and this is applied to simulate the state of a lithium-ion battery in time. The Koopman models discussed in this study are very promising for the wide range of problems considered.

Kaushik Balakrishnan · Devesh Upadhyay 🔗 


Discovering PDEs from Multiple Experiments
(Poster)
Automated model discovery of partial differential equations (PDEs) usually considers a single experiment or dataset to infer the underlying governing equations. In practice, experiments have inherent natural variability in parameters, initial and boundary conditions that cannot be simply averaged out. We introduce a randomised adaptive group Lasso sparsity estimator to promote grouped sparsity and implement it in a deep learning-based PDE discovery framework. This creates a learning bias that encodes the a priori assumption that all experiments can be explained by the same underlying PDE terms with potentially different coefficients. Our experimental results show that more generalizable PDEs can be found from multiple highly noisy datasets by promoting grouped sparsity than by simply performing independent model discoveries.
Georges Tod · Gert-Jan Both · Remy Kusters 🔗 


Nonlinear pileup separation with LSTM neural networks for cryogenic particle detectors
(Poster)
In high-background or calibration measurements with cryogenic particle detectors, a significant share of the exposure is lost due to pileup of recoil events. We propose a method for the separation of pileup events with an LSTM neural network and evaluate its performance on an exemplary data set. Despite a nonlinear detector response function, we can reconstruct the ground truth of a severely distorted energy spectrum reasonably well.
Felix Wagner 🔗 


Machine learning accelerated particle-in-cell plasma simulations
(Poster)
Particle-In-Cell (PIC) methods are frequently used for kinetic, high-fidelity simulations of plasmas. Implicit formulations of PIC algorithms feature strong conservation properties, up to numerical roundoff errors, and are not subject to timestep limitations, which makes them attractive candidates for simulations of fusion plasmas. Currently they remain prohibitively expensive for high-fidelity simulation of macroscopic plasmas. We investigate how amortized solvers can be incorporated with PIC methods for simulations of plasmas. Incorporated into the amortized solver, a neural network predicts a vector space that entails an approximate solution of the PIC system. The network uses only fluid moments and the electric field as input, and its output is used to augment the vector space of an iterative linear solver. We find that this approach reduces the average number of required solver iterations by about 25% when simulating electron plasma oscillations. This novel approach may make it possible to accelerate implicit PIC simulations while retaining all conservation laws and may also be appropriate for multiscale systems.
Ralph Kube · Randy Churchill 🔗 


Calibrating Electrons and Photons in the CMS ECAL using Graph Neural Networks
(Poster)
The Compact Muon Solenoid (CMS) detector is one of two general-purpose detectors on the energy frontier of particle physics at the CERN Large Hadron Collider (LHC). Products of proton-proton collisions at a center-of-mass energy of 13 TeV are reconstructed in the CMS detector to probe the standard model of particle physics, and to search for processes beyond the standard model. The development of precision algorithms for this reconstruction is therefore a key objective in optimizing the precision of all physics results at CMS. While machine learning techniques are now prevalent at CMS for these tasks, they have largely relied on high-level human-engineered input features. However, much of the disruptive impact of machine learning in industry has been realized by bypassing human feature engineering and instead training deep learning algorithms on low-level data. We have developed a novel machine learning architecture based on dynamic graph neural networks which allows regression directly on low-level detector hits, and we have applied this model to the calibration of electron and photon energies in CMS. In this work, the performance of our new architecture is shown on electrons used in the calibration of the CMS detector, where we obtain an improvement in energy resolution by as much as 10% with respect to the previous state-of-the-art reconstruction method.
Simon Rothman 🔗 


Recalibrating Photometric Redshift Probability Distributions Using Feature-space Regression
(Poster)
Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift, i.e., the fraction of times the true redshift falls between two limits z1 and z2 should be equal to the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to recalibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local recalibration of photometric redshift PDFs. Though we focus on an example from astrophysics, our method can produce PDFs which are calibrated at all locations in feature space for any use case.
Biprateep Dey · Ann Lee · Rafael Izbicki · David Zhao 🔗 


Towards Improved Global River Discharge Prediction in Ungauged Basins Using Machine Learning and Satellite Observations
(Poster)
The recent increase in frequency and severity of natural disasters is a clear indication of an immediate need to address the cascading impacts of climate change. However, climate change cannot be measured directly. In a weather cycle, river discharge is the end result of any hydrologic process, and thus directly measures the effect of two major parameters used to measure impacts of climate change: temperature and precipitation. Unlike current methods that are able to infer climate change patterns over a long period of time, river discharge is an effective proxy for measuring the effects of climate change within a short period of time. Unfortunately, current statistical and physics-based models neither take full advantage of hydrometeorological information encoded in over 100 years of historical hydrologic data nor are they applicable on a global scale. In this work, we train Long Short-Term Memory (LSTM) Recurrent Neural Network models on satellite observations and daily discharge from gauged basins. Our models outperform the latest state-of-the-art process-based hydrology models, with Kling-Gupta and Nash-Sutcliffe Efficiency scores of 85% and 81% respectively in ungauged basins with limited to non-existent data. This will allow accurate predictions in the majority of the global river basins that do not have in-situ measurements.
Aggrey Muhebwa · Jay Taneja 🔗 
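
For readers unfamiliar with the two scores quoted in the abstract above, the following is a minimal sketch of the Nash-Sutcliffe Efficiency and the Kling-Gupta Efficiency in their commonly used textbook forms. The function names are hypothetical, and it is an assumption that the authors used these exact variants; a score of 1.0 is a perfect match, so the quoted 85% and 81% would correspond to 0.85 and 0.81.

```python
import math

def _mean(x):
    return sum(x) / len(x)

def _std(x):
    # Population standard deviation.
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def nash_sutcliffe(obs, sim):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    m = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den

def kling_gupta(obs, sim):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    r is the linear correlation, alpha the ratio of standard deviations
    (sim / obs), and beta the ratio of means (sim / obs).
    """
    mo, ms = _mean(obs), _mean(sim)
    so, ss = _std(obs), _std(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (so * ss)
    alpha = ss / so
    beta = ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A simulation that always predicts the observed mean discharge scores an NSE of exactly 0, which is why NSE is often read as skill relative to a climatological baseline.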


Visualization of nonlinear modal structures for three-dimensional unsteady fluid flows with customized decoder design
(Poster)
Understanding nonlinear manifolds of scientific data extracted via autoencoders is important to propel practical uses of non-intrusive reduced-order modeling in the community. We here tackle this matter by visualizing nonlinear autoencoder modes with the aid of a mode-decomposing convolutional neural network autoencoder (MD-CNN-AE). The MD-CNN-AE has a customization in the decoder part, which enables us to visualize individual modes extracted through the encoder part. The present demonstration is performed with a three-dimensional flow around a square cylinder at $Re_D = 300$, which possesses complex nonlinear vortical phenomena associated with strong nonlinearities. The results are compared with a conventional linear model order reduction method, i.e., principal component analysis (PCA). The reconstructed fields with MD-CNN-AE hold more energetic information than those with PCA, despite the same number of latent variables. The present results indicate the strong capability of MD-CNN-AE for efficient low-dimensionalization and data compression of three-dimensional flow fields in an interpretable manner.
Kazuto Hasegawa · Kai Fukami · Koji Fukagata 🔗 


Sharpness-Aware Minimization for Robust Molecular Dynamics Simulations
(Poster)
Sharpness-aware minimization (SAM) is a novel regularization technique that takes advantage of not only the training error but also the landscape geometry of model parameters to improve model robustness. Although SAM has demonstrated state-of-the-art (SOTA) performance in image classification, its applicability to physical systems is yet to be examined. An ideal testbed is neural-network quantum molecular dynamics (NNQMD) simulations, which accurately predict material properties but whose trajectory stability is severely limited by thermal noise. In this paper, we demonstrate for the first time that the SAM regularizer achieves an order-of-magnitude reduction of the out-of-sample error in potential energy prediction using several SOTA models. Comparing NNQMD datasets with distinct structural characteristics, we found that SAM consistently reduces the out-of-sample error for a crystal dataset at high temperatures with enhanced thermal noise, thus proving the concept of SAM-enhanced robust NNQMD, while no clear trend was observed with an amorphous dataset. Our result suggests a possible correlation between materials structure and model parameter landscape.
Hikaru Ibayashi · Kenichi Nomura · Pankaj Rajak · Aravind Krishnamoorthy · Aiichiro Nakano 🔗 


A posteriori learning for quasi-geostrophic turbulence parametrization
(Poster)
Modeling the subgrid-scale dynamics of reduced models is a long-standing open problem that finds application in ocean, atmosphere and climate predictions where direct numerical simulation (DNS) is impossible. While neural networks (NNs) have already been applied to a range of three-dimensional problems with success, the backward energy transfer of two-dimensional flows still remains a stability issue for trained models. We show that learning a model jointly with the dynamical solver and a meaningful a posteriori-based loss function leads to stable and realistic simulations when applied to quasi-geostrophic turbulence.
Hugo Frezat · Ronan Fablet · Redouane Lguensat 🔗 


Vision transformers and techniques for improving solar wind speed forecasts using solar EUV images
(Poster)
Extreme-ultraviolet images taken by the Atmospheric Imaging Assembly make it possible to use deep vision techniques in the prediction of solar wind speed, a difficult, high-impact, and unsolved problem. This study uses vision transformers and a set of methodological and modelling improvements to deliver an 11.1% lower RMSE and a 17.4% higher prediction correlation compared to the previous state-of-the-art models. Furthermore, our analysis shows that vision transformers combined with our pipeline consistently outperform convolutional alternatives. Additionally, the best vision transformer outperforms the best convolutional model by 1.8% in RMSE and 2.6% in correlation with the ground truth solar wind speed.
Filip Svoboda · Edward Brown 🔗 


DeepDFT: Physics-ML hybrid method to predict DFT energy using Transformer
(Poster)
Computing the energy of molecules plays a critical role in molecule design. Classical ab-initio methods using Density Functional Theory (DFT) often suffer from scalability issues due to their extreme computing cost. A growing number of data-driven neural-net-based DFT surrogate models have been proposed to address this challenge. After being trained on the ab-initio reference data, these models significantly accelerate the energy prediction of molecular systems, circumventing the need to numerically solve the Schrödinger equation. However, the performance of these models is often limited to the scope of the training data distribution. It is also challenging to discover physical insights from their predictions due to the lack of interpretability of neural networks. In this paper, we aim to design a physics-ML hybrid DFT surrogate model, which is both physically interpretable and generalizable beyond the training data distribution. To achieve these goals, we propose a physics-driven approach to fit the energy to an equation combining Coulomb and Lennard-Jones potentials by first predicting their sub-parameters, then computing the energy from the equation. Our experimental results show the effectiveness of the proposed approach in its performance, generalizability, and interpretability.
Youngwoo Cho · Marco Yi · Jaegul Choo · Joonseok Lee · Sookyung Kim 🔗 
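
To make the "equation combining Coulomb and Lennard-Jones potentials" in the abstract above concrete, here is a minimal sketch of the standard pairwise forms of those two potentials. It is not the authors' model: in the abstract the sub-parameters are predicted by a network, whereas here the charges, `epsilon`, and `sigma` are plain inputs, and the reduced units, the single shared `epsilon`/`sigma`, and all function names are assumptions made for illustration.

```python
import math

COULOMB_K = 1.0  # Coulomb constant in reduced units (assumption)

def pair_energy(r, q_i, q_j, epsilon, sigma):
    """Coulomb plus 12-6 Lennard-Jones energy of one pair at distance r."""
    coulomb = COULOMB_K * q_i * q_j / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 ** 2 - sr6)
    return coulomb + lennard_jones

def total_energy(positions, charges, epsilon, sigma):
    """Sum the pair energy over all unique atom pairs (i < j)."""
    energy = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            energy += pair_energy(r, charges[i], charges[j], epsilon, sigma)
    return energy
```

For a neutral pair, the Lennard-Jones term reaches its minimum of `-epsilon` at `r = 2 ** (1 / 6) * sigma`, which gives a quick sanity check on any fitted parameters.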


Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models
(Poster)
Transformers are state-of-the-art deep learning models that are composed of stacked attention and pointwise, fully connected layers designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP), but also have recently inspired a new wave of Computer Vision (CV) applications research. In this work, a Vision Transformer (ViT) is fine-tuned to predict the state variables of 2-dimensional Ising model simulations. Our experiments show that ViT outperforms state-of-the-art Convolutional Neural Networks (CNN) when using a small number of microstate images from the Ising model corresponding to various boundary conditions and temperatures. This work explores possible applications of ViT to other simulations and introduces interesting research directions on how attention maps can learn the underlying physics governing different phenomena.
Onur Kara · Arijit Sehanobish · Hector Corzo 🔗 


Physics-informed neural network for inversely predicting effective electric permittivities of metamaterials
(Poster)
We apply a physics-informed neural network framework for inversely retrieving the effective material parameters of a two-dimensional metasurface from its scattered field(s). We show that by employing a loss function based on the Helmholtz wave equation, we can model the performance of a metamaterial disc-shaped structure and split-ring resonator with great promise and demonstrate the dependence of resonant behavior on the homogenized electric permittivity distribution profile generated by our network.
Parama Pal · Prajith P 🔗 


Learning Size and Shape of Calabi-Yau Spaces
(Poster)
We present a new machine learning library for computing metrics of string compactification spaces. We benchmark the performance on Monte Carlo-sampled integrals against previous numerical approximations and find that our neural networks are more sample- and computation-efficient. We are the first to provide the possibility to compute these metrics for arbitrary, user-specified shape and size parameters of the compact space and observe a linear relation between optimization of the partial differential equation we are training against and vanishing Ricci curvature.
Robin Schneider 🔗 


Learning Full Configuration Interaction Electron Correlations with Deep Learning
(Poster)
In this report, we present a deep learning framework termed the Electron Correlation Potential Neural Network (eCPNN) that can learn succinct and compact potential functions. These functions can effectively describe the complex instantaneous spatial correlations among electrons in many-electron atoms. The eCPNN was trained in an unsupervised manner with limited information from Full Configuration Interaction (FCI) one-electron density functions within predefined limits of accuracy. Using the effective correlation potential functions generated by eCPNN, we can predict the total energies of each of the studied atomic systems with remarkable accuracy when compared to FCI energies.
Hector Corzo · Arijit Sehanobish · Onur Kara 🔗 


A deep ensemble approach to X-ray polarimetry
(Poster)
X-ray polarimetry will soon open a new window on the high energy universe with the launch of NASA's Imaging X-ray Polarimetry Explorer (IXPE). Polarimeters are currently limited by their track reconstruction algorithms, which use linear estimators and do not consider individual event quality. We present a modern deep learning method for maximizing the sensitivity of X-ray telescopic observations with imaging polarimeters, with a focus on the gas pixel detectors (GPDs) to be flown on IXPE. We use a weighted maximum likelihood combination of predictions from a deep ensemble of ResNets, trained on Monte Carlo event simulations. We derive and apply the optimal event weighting for maximizing the signal-to-noise ratio (SNR) in track reconstruction algorithms. For typical power-law source spectra, our method improves on the current state of the art, providing a ~40% decrease in required exposure times.
Lawrence Peirson 🔗 


Learning Discrete Neural Reaction Class to Improve Retrosynthesis Prediction
(Poster)
Computer-aided retrosynthesis accelerates and innovates the process of molecule and material design, allowing the discovery of new pathways and automating part of the overall development process for drugs and materials. Current machine-learning methods applied to retrosynthesis are limited by their lack of control when generating single-step reactions, as they rely on sampling or beam search algorithms. In this work, we apply vector quantized representation learning [1] to learn reaction classes along with retrosynthetic predictions. We represent each reaction class with a vector, allowing us to condition the retrosynthetic prediction. We show that learning reaction classes increases control and generates more diverse predictions than a baseline model. Our results are a significant step forward in the development of multistep retrosynthesis prediction.
Théophile Gaudin · Animesh Garg · Alan Aspuru-Guzik 🔗 


Classical variational simulation of the Quantum Approximate Optimization Algorithm
(Poster)
We introduce a method to simulate parametrized quantum circuits, an architecture behind many practical algorithms on near-term hardware, focusing on the Quantum Approximate Optimization Algorithm (QAOA). A neural-network parametrization of the many-qubit wave function is used, reaching 54 qubits at 4 QAOA layers, approximately implementing 324 RZZ gates and 216 RX gates without requiring large-scale computational resources. Our approach can be used to provide accurate QAOA simulations at previously unexplored parameter values and to benchmark the next generation of experiments in the Noisy Intermediate-Scale Quantum (NISQ) era.
Matija Medvidović · Giuseppe Carleo 🔗 


Deep Surrogate for Direct Time Fluid Dynamics
(Poster)
Computational Fluid Dynamics solvers have benefited from strong developments for decades, being critical for many scientific and industrial applications. The downside of their great accuracy is a requirement for tremendous computational resources. In this short article, we present our ongoing work to design a data-driven deep surrogate: a neural network that is trained to provide a quality solution to the Navier-Stokes equations for a given domain and given initial and boundary conditions. The resulting surrogate is expected to substitute for traditional solvers in a limited range of input conditions, and enable interactive parameter exploration, sensitivity analysis, and digital twins. Some approaches to building data-driven surrogates mimic the solver's iterative process, being trained to compute the fluid transition from a time step t to t+1. Other surrogates are trained to directly produce a time step t, and are called "direct time". Surrogates also differ in their approach to space discretization. If the mesh is a regular grid, CNNs can be used. Irregular meshes or particle-based approaches are more challenging, and can be addressed through some variations of graph neural networks (GNN). Our contribution is a novel direct time GNN architecture for irregular meshes. It consists of a succession of graphs of increasing size connected by spline convolutions. Early experiments with the von Kármán vortex street benchmark show that our architecture achieves small generalization errors (RMSE at about $10^{-3}$) and is not subject to error accumulation along the trajectory.
Lucas Meyer · Bruno Raffin 🔗 


Normalizing Flows for Random Fields in Cosmology
(Poster)
We study the use of normalizing flows to represent the field-level probability density distribution of random fields in cosmology such as the matter and radiation distribution. We evaluate the performance of the real NVP flow for sampling of Gaussian and near-Gaussian random fields, and N-body simulations, and check the quality of samples with different statistics such as power spectrum and bispectrum estimators. We explore aspects of these flows that are specific to cosmology, such as flowing from a physical prior distribution and evaluating the density estimation results in the analytically tractable correlated Gaussian case.
Adam Rouhiainen 🔗 


Machine Learning and Dynamical Models for Subseasonal Climate Forecasting
(Poster)
Subseasonal forecasting (SSF) is the prediction of key climate variables such as temperature and precipitation on a 2-week to 2-month time horizon. Skillful SSF would have substantial societal value in areas such as agricultural productivity, water resource management, and emergency planning for droughts and wildfires. Despite its societal importance, SSF has remained a challenging problem and mainly relies on physics-based dynamical models. Meanwhile, recent studies have shown the potential of machine learning (ML) models to advance SSF. In this paper, we show that suitably incorporating dynamical model forecasts as inputs to ML models can substantially improve their forecasting performance. The SSF dataset and codebase constructed for the work will be made available along with the paper.
Sijie He · Xinyan Li · Laurie Trenary · Benjamin Cash · Timothy DelSole · Arindam Banerjee 🔗 


Multiway Ensemble Kalman Filter
(Poster)
In this work, we study the emergence of sparsity and multiway structures in second-order characterizations of dynamical processes governed by partial differential equations (PDEs). We consider several state-of-the-art multiway covariance and inverse covariance (precision) matrix estimators and examine their pros and cons in terms of accuracy and interpretability in the context of physics-driven forecasting using the ensemble Kalman filter (EnKF). In particular, we show that multiway data generated from the Poisson and the convection-diffusion types of PDEs can be accurately tracked via EnKF when integrated with appropriate covariance and precision matrix estimators.
Wayne Wang · Alfred Hero 🔗 


Neural quantum states for supersymmetric quantum gauge theories
(Poster)

Enrico Rinaldi 🔗 


A Quasi-Universal Neural Network to Model Structure Formation in the Universe
(Poster)
The large-scale structure of the Universe is the direct consequence of its evolution over billions of years. The observations of this large-scale structure in terms of galaxy redshift surveys contain valuable cosmological information and in order to extract that information, we need to compare these observations to corresponding theory predictions from cosmological simulations, whose generation in itself is a very computationally intensive feat. This work uses deep convolutional neural networks to simulate the large-scale structure of the Universe and generate a typical cosmological simulation orders of magnitude faster than the standard N-body simulations within an accuracy of ~1% on the most common cosmological summary statistics. The most important feature of our model is that it extrapolates extremely well on universes with entirely different cosmologies than the one it has been trained on. The use of such an approach will be particularly useful in the near future to compare theory with predictions, to generate mock galaxy catalogs, to compute covariance matrices, and to optimize observational strategies.
Neerav Kaushal · Elena Giusarma · Mauricio Reyes Hurtado 🔗 


Learning governing equations of interacting particle systems using Gaussian process regression
(Poster)
Interacting particle or agent systems that display a rich variety of collective motions are ubiquitous in science and engineering. The fundamental and challenging goals are to infer individual interaction rules that yield collective behaviors and establish the governing equations. In this paper, we study the data-driven discovery of second-order interacting particle systems with distance-based interaction laws, which are known to have the capability to reproduce a rich variety of collective patterns. We propose a learning approach that models the latent interaction function as a Gaussian process, which can simultaneously fulfill two inference goals: one is the nonparametric inference of the interaction function with pointwise uncertainty quantification, and the other is the inference of unknown parameters in the non-collective forces of the system. We test the learning approach on the D'Orsogna model, and numerical results demonstrate its effectiveness.
Sui Tang 🔗 


Marrying the benefits of Automatic and Numerical Differentiation in Physics-Informed Neural Network
(Poster)
In this study, a novel physics-informed neural network (PINN) is proposed to allow efficient training with improved accuracy. PINNs typically constrain their training loss function with differential equations to ensure outputs obey underlying physics. These differential operators are typically computed via automatic differentiation (AD), but this can fail with insufficient collocation points. Hence, the idea of coupling both AD and numerical differentiation (ND) is employed. The proposed coupled-automatic-numerical differentiation scheme (can-PINN) strongly links collocation points, thus enabling efficient training while being more accurate than simply using ND. As a demonstration, two instantiations of can-PINN were derived for the incompressible Navier-Stokes equations and applied to modeling of lid-driven flow in a cavity. Results show that can-PINNs can achieve very good accuracy even when the corresponding AD-based PINN fails.
Pao-Hsiung Chiu · Chin Chun Ooi · Yew Soon Ong 🔗 


GSpaNet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention
(Poster)
We introduce a novel method for constructing symmetry-preserving attention networks which reflect the natural invariances of the jet-parton assignment problem to efficiently find assignments without evaluating all permutations. This general approach is applicable to arbitrarily complex configurations and significantly outperforms current methods, improving reconstruction efficiency by between 19% and 35% on benchmark problems while decreasing inference time by two to five orders of magnitude, making many important and previously intractable cases tractable.
Alex Shmakov · Shih-Chieh Hsu · Pierre Baldi 🔗 


Critical parametric quantum sensing with machine learning
(Poster)
Open quantum systems can undergo dissipative phase transitions, and their critical behavior can be used to enhance, e.g., the fidelity of superconducting qubit readout measurements, a central problem toward the creation of reliable quantum hardware. For example, a recently introduced measurement protocol, named "critical parametric quantum sensing", uses the parametric (two-photon driven) Kerr resonator's driven-dissipative phase transition to reach single-qubit detection fidelity of 99.9%. These classification algorithms are applied to the time series data of weak quantum measurements (homodyne detection) of a circuit-QED implementation of the Kerr resonator coupled to a superconducting qubit. This demonstrates how machine learning methods enable a fast and reliable measurement protocol in critical open quantum systems.
Enrico Rinaldi 🔗 


Scaling Up Machine Learning For Quantum Field Theory with Equivariant Continuous Flows
(Poster)
We propose a continuous normalizing flow for sampling from the high-dimensional probability distributions of Quantum Field Theories in Physics. In contrast to the deep architectures used so far for this task, our proposal is based on a shallow design and incorporates the symmetries of the problem. We test our model on the ϕ⁴ theory, showing that it systematically outperforms a realNVP baseline in sampling efficiency, with the difference between the two increasing for larger lattices. On the largest lattice we consider, of size 32 x 32, we improve a key metric, the effective sample size, from 1% to 66% w.r.t. the realNVP baseline.
Pim de Haan · Roberto Bondesan 🔗 
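
The effective sample size quoted in the abstract above is commonly estimated from importance weights between the flow and the target distribution. The sketch below uses the usual normalised estimator, ESS = (Σ w)² / (N Σ w²); it is an assumption that the authors use exactly this form, and the function name and the convention that `log_weights` holds unnormalised log target density minus log model density are illustrative.

```python
import math

def effective_sample_size(log_weights):
    """Normalised ESS in (0, 1] from unnormalised log importance weights.

    Subtracting the maximum log weight before exponentiating avoids
    overflow and leaves the ratio unchanged.
    """
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    s1 = sum(w)
    s2 = sum(v * v for v in w)
    return s1 * s1 / (len(w) * s2)
```

Equal weights give an ESS of 1.0 (the model matches the target on the samples), while a single dominant weight among N samples gives 1/N, so values such as 1% and 66% can be read as the fraction of samples that are effectively independent draws from the target.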


Equivariant Transformers for Neural Network based Molecular Potentials
(Poster)
The prediction of quantum mechanical properties is historically plagued by a trade-off between accuracy and speed. Machine learning potentials have previously shown great success in this domain, reaching increasingly better accuracy while maintaining computational efficiency comparable with classical force fields. In this work we propose a novel equivariant Transformer architecture, outperforming state-of-the-art on MD17 and ANI-1. Through an extensive attention weight analysis, we gain valuable insights into the black box predictor and show differences in the learned representation of conformers versus conformations sampled from molecular dynamics or normal modes. Furthermore, we highlight the importance of datasets including off-equilibrium conformations for the evaluation of molecular potentials.
Philipp Thölke · Gianni De Fabritiis 🔗 


Fast Approximate Model for the 3D Matter Power Spectrum
(Poster)
Many Bayesian inference problems in cosmology involve complex models. Although these models have been meticulously designed, they can lead to intractable likelihoods, and each forward simulation can itself be computationally expensive, making the inverse problem of learning the model parameters a challenging task. In this paper, we develop an approximate model for the 3D matter power spectrum, P(k,z), which is a central quantity in a weak lensing analysis. Important outputs of this approximate model, often referred to as a surrogate model or emulator, are the first and second derivatives with respect to the input cosmological parameters. Without the emulator, the calculation of the derivatives requires multiple calls of the simulator, that is, the accurate Boltzmann solver, CLASS. We illustrate the application of the emulator in the calculation of different weak lensing and intrinsic alignment power spectra and we also demonstrate its performance on a toy simulated weak lensing dataset.
Arrykrishna Mootoovaloo


3D Pretraining improves GNNs for Molecular Property Prediction
(Poster)
Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their predictions for many molecular properties. However, this information is infeasible to compute at the scale required by most real-world applications. We propose pretraining a model to understand the geometry of molecules given only their 2D molecular graph. Using methods from self-supervised learning, we maximize the mutual information between a 3D summary vector and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to inform downstream tasks. We show that 3D pretraining provides significant improvements for a wide range of molecular properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Crucially, the learned representations can be effectively transferred between datasets with vastly different molecules.
Hannes Stärk · Gabriele Corso · Christian Dallago · Stephan Günnemann · Pietro Lió


Learning the exchange-correlation functional from nature with differentiable density functional theory
(Poster)
Improving the predictive capability of molecular properties in ab initio simulations is essential for advanced material discovery. Despite recent progress making use of machine learning, utilizing deep neural networks to improve quantum chemistry modelling remains severely limited by the scarcity and heterogeneity of appropriate experimental data. Here we show how training a neural network to replace the exchange-correlation functional within a fully-differentiable three-dimensional Kohn-Sham density functional theory (DFT) simulation can greatly improve its accuracy and generalizability. Using only eight experimental data points on diatomic molecules, our trained exchange-correlation networks provided improved predictions of atomization energies across a collection of 104 molecules containing new bonds and atoms that are not present in the training data.
Muhammad Firmansyah · Sam Vinko


RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network
(Poster)
Modern digital cameras and smartphones mostly rely on image signal processing (ISP) pipelines to produce realistic colored RGB images. However, compared to DSLR cameras, low-quality images are usually obtained in many portable mobile devices with compact camera sensors due to their physical limitations. The low-quality images have multiple degradations, i.e., sub-pixel shift due to camera motion, mosaic patterns due to the camera color filter array, and low resolution due to smaller camera sensors, and the remaining information is corrupted by noise. Such degradations limit the performance of current Single Image Super-resolution (SISR) methods in recovering high-resolution (HR) image details from a single low-resolution (LR) image. In this work, we propose a Raw Burst Super-Resolution Iterative Convolutional Neural Network (RBSRICNN) that follows the burst photography pipeline as a whole by a forward (physical) model. The proposed Burst SR scheme solves the problem with classical image regularization, convex optimization, and deep learning techniques, in contrast to existing black-box data-driven methods. The proposed network produces the final output by an iterative refinement of the intermediate SR estimates. We demonstrate the effectiveness of our proposed approach in quantitative and qualitative experiments that generalize robustly to real LR burst inputs with only synthetic burst data available for training.
Rao Umer · Christian Micheloni


Classifying Anomalies THrough Outer Density Estimation (CATHODE)
(Poster)
We propose a new model-agnostic search strategy for hints of new fundamental forces motivated by applications in particle physics. It is based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes potential signal events cluster in phase space in a signal region. However, backgrounds due to known processes are also present in the signal region and too large to directly detect such a signal. By training a conditional density estimator on a collection of additional features outside the signal region, interpolating it into the signal region, and sampling from it, we produce a collection of events that follow the background model. We can then train a classifier to distinguish the data from the events sampled from the background model, thereby approaching the optimal anomaly detector. Using the public LHC Olympics R&D data set, we demonstrate that CATHODE nearly saturates the best possible performance, and significantly outperforms other approaches in this bump hunt paradigm.
Joshua Isaacson · Gregor Kasieczka · Benjamin Nachman · David Shih · Manuel Sommerhalder


Unsupervised topological learning approach of crystal nucleation in pure Tantalum
(Poster)
Nucleation phenomena commonly observed in our everyday life are of fundamental, technological and societal importance in many areas, but some of their most intimate mechanisms remain to be unravelled. Crystal nucleation, the early stage where the liquid-to-solid transition occurs upon undercooling, initiates at the atomic level on nanometre length and sub-picosecond time scales and involves complex multidimensional mechanisms with local symmetry breaking that can hardly be observed experimentally in full detail. To reveal their structural features in simulations without a priori assumptions, an unsupervised learning approach founded on topological descriptors borrowed from persistent homology is proposed. Applied here to a monatomic metal, namely tantalum (Ta), it shows that both translational and orientational ordering always come into play simultaneously when homogeneous nucleation starts in regions with low five-fold symmetry.
Emilie Devijver · Rémi Molinier


DeepBO: Deep Neural-Network Bayesian Optimization of Polaritonic Metasurfaces in Continuous Space
(Poster)
Thermophotovoltaics (TPVs) rely on selective thermal emitters to tailor the blackbody radiation at high temperatures into band-matching emission for photovoltaic cells, resulting in power-conversion efficiencies surpassing the Shockley-Queisser limit. The selectivity of the thermal emitter must cover three orders of magnitude range of wavelengths, spreading from visible to the far infrared, which requires the superposition of multiple transformation theories of optics and degrees of freedom anisotropic geometries. It is extremely challenging to realize such high-dimensional complex metasurface design using conventional computational photonics. Here we develop a deep neural network-based Bayesian optimization (DeepBO) framework to screen a 16-dimensional design space of 10^43 candidates, and realize a record-high spectral efficiency of 69% for the TPV emitter. We show that the neural network combined with Bayesian linear regression is an efficient and robust surrogate model which scales linearly with the size of data. We also reveal the underlying physical mechanisms of the geometric design of the TPV emitters using principal component analysis (PCA). We anticipate that the DeepBO framework will be a useful tool for data-intensive complex geometric design for the photonics research community.
Zihan Zhang · Kehang Cui · Jintao Chen


A data-driven wall model for the prediction of turbulent flow separation over periodic hills
(Poster)
Direct Numerical Simulations and even wall-resolved Large Eddy Simulations remain computationally intractable for full blade span computations when going to large Reynolds numbers. The cost of representing turbulence near the wall motivates the development of wall-modeled LES. The present work proposes the use of Deep Neural Nets (DNN) to link the wall shear stress components to volume data extracted at multiple wall-normal distances $h_{wm}$ and wall-parallel locations. The developed data-driven wall model focuses on the prediction of separation, which is a frequently observed phenomenon in modern low-pressure turbines. The model is trained using a high-fidelity database of the two-dimensional periodic hill flow, which exhibits separation and is affordable to compute on modern clusters.
Margaux Boxho


Extending turbulence model uncertainty quantification using machine learning
(Poster)
In order to achieve a more virtual design and certification process of jet engines in the aviation industry, the uncertainty bounds for computational fluid dynamics have to be known. This work shows the application of a machine learning methodology to quantify the epistemic uncertainties of turbulence models. The underlying method for estimating the uncertainty bounds is based on eigenspace perturbations of the Reynolds stress tensor in combination with random forests.
Marcel Matha


Missing Data Imputation for Galaxy Redshift Estimation
(Poster)
Astronomical data is full of holes. While there are many reasons for this missing data, the data can be randomly missing, caused by things like data corruption or unfavourable observing conditions. We test some simple data imputation methods (Mean, Median, Minimum, Maximum and k-Nearest Neighbours (kNN)), as well as two more complex methods (Multivariate Imputation by Chained Equations (MICE) and Generative Adversarial Imputation Network (GAIN)) against data where increasing amounts are randomly set to missing. We then use the imputed datasets to estimate the redshift of the galaxies, using the kNN and Random Forest ML techniques. We find that the MICE algorithm provides the lowest Root Mean Square Error and consequently the lowest prediction error, with the GAIN algorithm the next best.
Kieran Luken
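As a rough illustration of the simplest imputation baselines named in this abstract, the sketch below fills missing entries of one feature column with the mean or median of the observed values and scores the result with the root mean square error against the known truth. The toy values and all function names are hypothetical, and MICE, GAIN and kNN imputation are not reproduced here.

```python
import math
import statistics


def impute(column, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def rmse(truth, estimate):
    """Root mean square error between two equally long sequences."""
    squared = [(t - e) ** 2 for t, e in zip(truth, estimate)]
    return math.sqrt(sum(squared) / len(squared))


# Toy column with one value masked out (made-up numbers, not survey data).
truth = [1.0, 2.0, 3.0, 10.0]
masked = [1.0, None, 3.0, 10.0]
mean_filled = impute(masked, "mean")
median_filled = impute(masked, "median")
```

On this toy column the outlier pulls the mean fill away from the true value, so the median fill yields the smaller RMSE; which baseline wins on real data depends on the feature distributions.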


Arbitrary Marginal Neural Ratio Estimation for Simulation-based Inference
(Poster)
In many areas of science, complex phenomena are modeled by stochastic parametric simulators, often featuring high-dimensional parameter spaces and intractable likelihoods. In this context, performing Bayesian inference can be challenging. In this work, we present a novel method that enables amortized inference over arbitrary subsets of the parameters, without resorting to numerical integration, which makes interpretation of the posterior more convenient. Our method is efficient and can be implemented with arbitrary neural network architectures. We demonstrate the applicability of the method on parameter inference of binary black hole systems from gravitational wave observations.
François Rozet · Gilles Louppe


Amortized Bayesian inference of gravitational waves with normalizing flows
(Poster)
Gravitational waves (GWs) detected by the LIGO and Virgo observatories encode descriptions of their astrophysical progenitors. To characterize these systems, physical GW signal models are inverted using Bayesian inference coupled with stochastic samplers, a task that can take O(day) for a typical binary black hole. Several recent efforts have attempted to speed this up by using normalizing flows to estimate the posterior distribution conditioned on the observed data. In this study, we further develop these techniques to achieve results nearly indistinguishable from standard samplers when evaluated on real GW data, with inference times of one minute per event. This is enabled by (i) incorporating detector non-stationarity from event to event by conditioning on a summary of the noise characteristics, (ii) using an embedding network adapted to GW signals to compress data, and (iii) adopting a new inference algorithm that makes use of underlying physical equivariances.
Maximilian Dax · Stephen Green · Jakob Macke · Bernhard Schölkopf


Weight Pruning and Uncertainty in Radio Galaxy Classification
(Poster)
In this work we use variational inference to quantify the degree of epistemic uncertainty in model predictions of radio galaxy classification and show that the level of model posterior variance for individual test samples is correlated with measures of human uncertainty when labelling radio galaxies. Using the posterior distributions for individual weights, we show that signal-to-noise ratio (SNR) ranking allows pruning of the fully-connected layers to the level of 40% without significant loss of performance, and that this pruning reduces the predictive uncertainty in the model. Finally we show that, like other work in this field, we experience a cold posterior effect. We examine whether the inclusion of an additional variance term in the loss can compensate for this effect, but find that it does not make a significant difference. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample rather than model misspecification and raise this as a potential issue for Bayesian approaches to radio galaxy classification in future.
Anna Scaife
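The SNR-ranked pruning described in this abstract can be sketched as follows, assuming the usual definition SNR = |μ| / σ for a weight whose variational posterior has mean μ and standard deviation σ. The function name and the `keep_fraction` parameter are hypothetical, and the abstract does not specify whether "40%" refers to the weights kept or removed, so the fraction is left as an argument.

```python
def snr_prune_mask(means, stds, keep_fraction):
    """Boolean keep-mask retaining the weights with the highest |mu| / sigma.

    Assumption: each weight has a Gaussian posterior summarised by (mu, sigma).
    Ties are resolved in favour of the weight that appears first.
    """
    snr = [abs(m) / s for m, s in zip(means, stds)]
    n_keep = max(1, round(keep_fraction * len(snr)))
    ranked = sorted(range(len(snr)), key=lambda i: snr[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [i in keep for i in range(len(snr))]
```

Weights whose mask entry is False would be set to zero; a weight with a large mean but an even larger posterior spread is pruned before a small but well-determined one.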


Automatic differentiation approach for reconstructing spectral functions with neural networks
(Poster)
Reconstructing spectral functions from Euclidean Green’s functions is an important inverse problem in physics. Prior knowledge of specific physical systems routinely offers essential regularization schemes for solving the ill-posed problem approximately. To this end, we propose an automatic differentiation framework as a generic tool for the reconstruction from observable data. We represent the spectra by neural networks and use chi-square as the loss function to optimize the parameters with backward automatic differentiation in an unsupervised manner. In the training process, no explicit physical prior is embedded in the neural networks except the positive-definite form. The reconstruction accuracy is assessed through Kullback–Leibler (KL) divergence and mean square error (MSE) at multiple noise levels. It should be noted that the automatic differentiation framework and the freedom of introducing regularization are inherent advantages of the present approach and may lead to improvements in solving inverse problems in the future.
Lingxiao Wang · Kai Zhou


Characterizing γ-ray maps of the Galactic Center with neural density estimation
(Poster)
Machine learning methods have enabled new ways of performing inference on high-dimensional datasets modeled using complex simulations. We leverage recent advancements in simulation-based inference in order to characterize the contribution of various modeled components to γ-ray data of the Galactic Center recorded by the Fermi satellite. A specific goal here is to differentiate "smooth" emission, as expected for a dark matter origin, from more "clumpy" emission expected for a population of relatively bright, unresolved astrophysical point sources. Compared to traditional techniques based on the statistical distribution of photon counts, our method based on density estimation using normalizing flows is able to utilize more of the information contained in a given model of the Galactic Center emission, and in particular can perform posterior parameter estimation while accounting for pixel-to-pixel spatial correlations in the γ-ray map.
Siddharth Mishra-Sharma · Kyle Cranmer


Bayesian Stokes inversion with Normalizing flows
(Poster)
Stokes inversion techniques are very powerful methods for obtaining information on the thermodynamic and magnetic properties of solar and stellar atmospheres. Most of the existing inversion codes are designed for finding the optimum solution to the nonlinear inverse problem. However, to obtain the location of potentially multimodal solutions, degeneracies, and the uncertainties of each parameter from the inversions, algorithms such as Markov chain Monte Carlo require evaluating the model thousands of times. Variational methods offer a quick alternative by approximating the posterior distribution with a parametrized distribution. In this study, we explore a highly flexible variational method, known as normalizing flows, to return Bayesian posterior probabilities for solar observations. We illustrate the ability of the method using a simple Milne-Eddington model and a complex non-LTE inversion. The training procedure need only be performed once for a given prior parameter space and the resulting network can then generate samples describing the posterior distribution several orders of magnitude faster than existing techniques.
Carlos Díaz Baso


Generative models for hadron shower simulation
(Poster)
Simulations provide the crucial link between theoretical descriptions and experimental observations in the physical sciences. In experimental particle physics, a complex ecosystem of tools exists to describe fundamental processes or the interactions of particles with detectors. The high computational cost associated with producing precise simulations in sufficient quantities, e.g. for the upcoming data-taking phase of the Large Hadron Collider (LHC) or future colliders, motivates research into more computationally efficient solutions. Using generative machine learning models to amplify the statistics of a given dataset is an especially promising direction. However, the simulation of realistic showers in a highly granular detector remains a daunting problem due to the large number of cells, values spanning many orders of magnitude, and the overall sparsity of data. This contribution advances the state of the art in two key directions: Firstly, we present a precise generative model for the fast simulation of hadronic showers in a highly granular hadronic calorimeter. Secondly, we compare the achieved simulation quality before and after interfacing with a so-called particle-flow-based reconstruction algorithm. Together, these bring generative models one step closer to practical applications.
Sascha Diefenbacher · Erik Buhmann · Engin Eren · Frank Gaede · Daniel C. Hundhausen · Gregor Kasieczka · William Korcari · Katja Krueger · Peter McKeown · Lennart Rustige


Detecting Spatiotemporal Lightning Patterns: An Unsupervised Graph-Based Approach
(Poster)
Accurate measures of lightning activity can be used to predict extreme weather events in advance, saving lives and property. However, the current handcrafted filtering algorithm for identifying true lightning events from data captured by the GLM onboard NOAA's GOES-R satellites is only 70% accurate, with a 5% false alarm rate. Given the large volume and high temporal resolution, this work applies unsupervised learning techniques in an effort to detect lightning within raw data signals. We present a novel data processing pipeline for the GLM Level 0 products and a case study comparison of two approaches to dimensionality reduction and clustering to sort the data by similar patterns. These clusters could then be labeled by a domain expert to accurately distinguish between noise and true lightning events. We demonstrate that autoencoders with graph convolution layers are able to learn a translationally invariant representation of the dataset which allows for k-means clustering to group samples that have similar spatiotemporal patterns together. This work is a first step towards building a machine learning pipeline for improving false event filtering to identify lightning and enhance predictive abilities in the face of increasingly frequent extreme weather events.
Emma Benjaminson · Juan Emmanuel Johnson · Milad Memarzadeh · Nadia Ahmed


Uncertainty Aware Learning for High Energy Physics With A Cautionary Tale
(Poster)
Machine learning tools provide a significant improvement in sensitivity over traditional analyses by exploiting subtle patterns in high-dimensional feature spaces. These subtle patterns may not be well-modeled by the simulations used for training machine learning methods, resulting in an enhanced sensitivity to systematic uncertainties. Contrary to the traditional wisdom of constructing an analysis strategy that is invariant to systematic uncertainties, we study the use of a classifier that is fully aware of uncertainties and their corresponding nuisance parameters. We show on two datasets that this dependence can actually enhance the sensitivity to parameters of interest compared to baseline approaches. Finally, we provide a cautionary example for situations where uncertainty-mitigating techniques may serve only to hide the true uncertainties.
Aishik Ghosh · Benjamin Nachman


Error Analysis of Kilonova Surrogate Models
(Poster)
Studies of kilonovae, optical counterparts of binary neutron star mergers, rely on accurate simulation models. The most accurate simulations are computationally expensive; surrogate modelling provides a route to emulate the original simulations and therefore use them for statistical inference. We present a new implementation of surrogate construction using conditional variational autoencoders (cVAE) and discuss the challenges of this method. We additionally present model evaluation methods tailored to the scientific analyses of this field. We find that the cVAE surrogate produces errors well within a standard assumed systematic modelling uncertainty. We also report the results of our parameter inference study, finding our constrained parameters to be comparable with previously published results. 
Kamile Lukosiute · Brian Nord


Can semi-supervised learning reduce the amount of manual labelling required for effective radio galaxy morphology classification?
(Poster)
In this work, we examine the robustness of state-of-the-art semi-supervised learning (SSL) algorithms when applied to morphological classification in modern radio astronomy. We test whether SSL can achieve performance comparable to the current supervised state of the art when using many fewer labelled data points and if these results generalise to using truly unlabelled data. We find that although SSL provides additional regularisation, its performance degrades rapidly when using very few labels, and that using truly unlabelled data leads to a significant drop in performance.
Inigo V Slijepcevic · Anna Scaife


A simple equivariant machine learning method for dynamics based on scalars
(Poster)
Physical systems obey strict symmetry principles. We expect that machine learning methods that intrinsically respect these symmetries should perform better than those that do not. In this work we implement a principled model based on invariant scalars, and release open-source code. We apply this Scalars method to a simple chaotic dynamical system, the springy double pendulum. We show that the Scalars method outperforms state-of-the-art approaches for learning the properties of physical systems with symmetries, both in terms of accuracy and speed. Because the method incorporates the fundamental symmetries, we expect it to generalize to different settings, such as changes in the force laws in the system.
Weichi Yao · Kate Storey-Fisher · David W Hogg · Soledad Villar
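The idea of building a model on invariant scalars can be illustrated with a small check. The sketch below assumes that the scalars are pairwise inner products of the input vectors, which is one standard choice of rotation-invariant feature; the abstract does not spell out the construction, and the helper names are hypothetical.

```python
import math


def rotate2d(vector, angle):
    """Rotate a 2D vector counter-clockwise by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vector[0] - s * vector[1], s * vector[0] + c * vector[1])


def invariant_scalars(vectors):
    """All pairwise inner products v_i . v_j with i <= j."""
    n = len(vectors)
    return [
        sum(a * b for a, b in zip(vectors[i], vectors[j]))
        for i in range(n)
        for j in range(i, n)
    ]


# Two toy vectors (e.g. a position and a momentum) and a rotated copy.
vectors = [(1.0, 2.0), (-0.5, 0.3)]
rotated = [rotate2d(v, 0.7) for v in vectors]
```

Because `invariant_scalars(vectors)` and `invariant_scalars(rotated)` agree up to floating-point error, any function of these scalars is automatically unchanged when the whole system is rotated.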


Physics-enhanced Neural Networks in the Small Data Regime
(Poster)
Identifying the dynamics of physical systems requires a machine learning model that can assimilate observational data, but also incorporate the laws of physics. Neural networks based on physical principles, such as Hamiltonian or Lagrangian NNs, have recently shown promising results in generating extrapolative predictions and accurately representing the system's dynamics. We show that by additionally considering the actual energy level as a regularization term during training, and thus using physical information as inductive bias, the results can be further improved. Especially in the case where only small amounts of data are available, these improvements can significantly enhance the predictive capability. We apply the proposed regularization term to a Hamiltonian Neural Network (HNN) and Constrained Hamiltonian Neural Network (CHNN) for a single and double pendulum, generate predictions under unseen initial conditions and report significant gains in predictive accuracy.
Sebastian Kaltenbach · psk S Koutsourelakis
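One way to read the energy regularization described in this abstract is as a penalty on the mismatch between the energy of predicted states and the known energy level. The sketch below is an assumption about the form of that term, not the authors' loss: it uses the textbook single-pendulum Hamiltonian H(q, p) = p²/(2ml²) + mgl(1 − cos q), and the weight `lam` and all function names are hypothetical.

```python
import math


def pendulum_energy(q, p, mass=1.0, length=1.0, gravity=9.81):
    """Hamiltonian of an ideal single pendulum (angle q, angular momentum p)."""
    kinetic = p * p / (2.0 * mass * length * length)
    potential = mass * gravity * length * (1.0 - math.cos(q))
    return kinetic + potential


def regularized_loss(predicted, observed, energy_level, lam):
    """Mean squared state error plus lam times the mean squared energy mismatch."""
    n = len(predicted)
    data_term = sum(
        (qp - qo) ** 2 + (pp - po) ** 2
        for (qp, pp), (qo, po) in zip(predicted, observed)
    ) / n
    energy_term = sum(
        (pendulum_energy(q, p) - energy_level) ** 2 for q, p in predicted
    ) / n
    return data_term + lam * energy_term
```

A prediction that matches the data and sits on the correct energy level gives zero loss, while a prediction that drifts off the energy surface is penalized even where no observation is available to correct it.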


Approximate Latent Force Model Inference
(Poster)
Physically-inspired latent force models offer an interpretable alternative to purely data-driven tools for inference in dynamical systems. They carry the structure of differential equations and the flexibility of Gaussian processes, yielding interpretable parameters and dynamics-imposed latent functions. However, the existing inference techniques rely on the exact computation of posterior kernels which are seldom available in analytical form. Applications relevant to practitioners, such as diffusion equations, are hence intractable. We overcome these computational problems by proposing a variational solution to a general class of nonlinear and parabolic partial latent force models. We demonstrate the efficacy and flexibility of our framework by achieving competitive performance on several tasks.
Jacob Moss · Felix Opolka · Pietro Lió


Rethinking the modeling of the instrumental response of telescopes with a differentiable optical model
(Poster)
We propose a paradigm shift in the data-driven modeling of the instrumental response field of telescopes. By adding a differentiable optical forward model into the modeling framework, we change the data-driven modeling space from the pixels to the wavefront. This allows us to transfer a great deal of complexity from the instrumental response into the forward model while still adapting to the observations and remaining data-driven. Our framework allows us to build powerful models that are physically motivated, interpretable, and that do not require special calibration data. We show that for a realistic setting of the Euclid space telescope, this framework represents a real performance breakthrough, with reconstruction errors decreasing by a factor of 5 at observation resolution and by more than a factor of 10 for a 3x super-resolution. We successfully model chromatic variations of the instrument's response using only noisy broadband in-focus observations.
Tobias Liaudat · Jean-Luc Starck


Inorganic Synthesis Reaction Condition Prediction with Generative Machine Learning
(Poster)
Data-driven synthesis planning with machine learning is a key step in the design and discovery of novel inorganic compounds with desirable properties. Inorganic materials synthesis is often guided by chemists' prior knowledge and experience, built upon experimental trial-and-error that is both time- and resource-consuming. Recent developments in natural language processing (NLP) have enabled large-scale text mining of scientific literature, providing open source databases of synthesis information of synthesized compounds, material precursors, and reaction conditions (temperatures, times). In this work, we employ a conditional variational autoencoder (CVAE) to predict suitable inorganic reaction conditions for the crucial inorganic synthesis steps of calcination and sintering. We find that the CVAE model is capable of learning subtle differences in target material composition, precursor compound identities, and choice of synthesis route (solid-state, sol-gel) that are present in the inorganic synthesis space. Moreover, the CVAE can generalize well to unseen chemical entities and shows promise for predicting reaction conditions for previously unsynthesized compounds of interest.
Christopher Karpovich


Quantum Machine Learning for Radio Astronomy
(Poster)
In this work we introduce a novel approach to the pulsar classification problem in time-domain radio astronomy using a Born machine, often referred to as a quantum neural network. Using a single-qubit architecture, we show that the pulsar classification problem maps well to the Bloch sphere and that comparable accuracies to more classical machine learning approaches are achievable. We introduce a novel single-qubit encoding for the pulsar data used in this work and show that this performs comparably to a multi-qubit QAOA encoding.
Mo Kordzanganeh · Anna Scaife


Proximal Biasing for Bayesian Optimization and Characterization of Physical Systems
(Poster)
Bayesian techniques have been shown to be extremely efficient in optimizing expensive-to-evaluate black-box functions, in both computational (offline design) and physical (online experimental control) contexts. Optimizing physical systems often comes with extra challenges due to costs associated with changing parameters in real-life experimentation, such as measurement location in physical space or mechanical/electrical actuation. In these cases, the cost of changing a given input parameter is often proportional to the magnitude of the change, for example the time cost associated with the distance a physical actuator must travel. To minimize these costs, optimization algorithms can simply limit the maximum distance travelled in input space during each step. However, hard restrictions on the travel distance inhibit the global exploration advantages normally afforded by Bayesian optimization algorithms. In this work, we describe a proximal weighting term that can bias acquisition functions towards localized exploration, while still allowing for large travel distances if far-away points are predicted to be valuable for observation. We describe a use case where this weighting is used to minimize the uncertainty of a particle accelerator Bayesian model in a smooth manner, which, in turn, minimizes temporal costs associated with changing input parameters.
Ryan Roussel · Auralee Edelen
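A proximal weighting term of the kind this abstract describes can be sketched as a soft multiplicative weight on a non-negative acquisition value. The Gaussian form centred on the most recently evaluated point, the `length_scale` parameter, and the function names below are assumptions made for illustration; the abstract does not give the functional form.

```python
import math


def proximal_weight(x, x_last, length_scale):
    """Gaussian weight that decays with squared distance from the last point."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, x_last))
    return math.exp(-dist_sq / (2.0 * length_scale ** 2))


def biased_acquisition(acq_value, x, x_last, length_scale):
    """Non-negative acquisition value scaled by the proximal weight."""
    return acq_value * proximal_weight(x, x_last, length_scale)


# Hypothetical candidates: one close to the last measurement, one far away.
x_last = (0.0, 0.0)
near, far = (0.1, 0.0), (2.0, 0.0)
```

With equal acquisition values the nearby candidate is preferred, but a distant candidate whose acquisition value is large enough still wins, which is the difference from a hard limit on travel distance.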


Learning the solar latent space: sigma-variational autoencoders for multiple channel solar imaging
(Poster)
This study uses a sigma-variational autoencoder to learn a latent space of the Sun using the 12 channels taken by the Atmospheric Imaging Assembly (AIA) and the Helioseismic and Magnetic Imager (HMI) instruments onboard the NASA Solar Dynamics Observatory. The model is able to significantly compress the large image dataset to 0.19% of its original size while still proficiently reconstructing the original images. As a downstream task making use of the learned representation, this study demonstrates the use of the solar latent space as an input to improve the forecasts of the F30 solar radio flux index, compared to an off-the-shelf pretrained ResNet feature extractor. Finally, the developed models can be used to generate realistic synthetic solar images by sampling from the learned latent space.
Edward Brown · Christopher Bridges · Bernard Benson · Atilim Gunes Baydin


Digital Twin Earth  Coasts: Developing a fast and physicsinformed surrogate model for coastal floods via neural operators
(Poster)
Developing fast and accurate surrogates for physics-based coastal and ocean models is an urgent need due to the coastal flood risk under accelerating sea level rise, and the computational expense of deterministic numerical models. For this purpose, we develop the first digital twin of Earth coastlines with new physics-informed machine learning techniques extending the state-of-the-art Neural Operator. As a proof-of-concept study, we built Fourier Neural Operator (FNO) surrogates on the simulations of an industry-standard flood and ocean model (NEMO). The resulting FNO surrogate accurately predicts the sea surface height in most regions while achieving upwards of 45x acceleration of NEMO. We delivered an open-source CoastalTwin platform in an end-to-end and modular way, to enable easy extensions to other simulations and ML-based surrogate methods. Our results and deliverable provide a promising approach to massively accelerate coastal dynamics simulators, which can enable scientists to efficiently execute many simulations for decision-making, uncertainty quantification, and other research activities.
Peishi Jiang · Constantin Weisser · Björn Lütjens · Dava Newman 🔗 


Probabilistic neural networks for predicting energy dissipation rates in geophysical turbulent flows
(Poster)
Motivated by oceanographic observational datasets, we propose a probabilistic neural network (PNN) model for calculating turbulent energy dissipation rates from vertical columns of velocity and density gradients in density-stratified turbulent flows. We train and test the model on high-resolution simulations of decaying turbulence designed to emulate geophysical conditions similar to those found in the ocean. The PNN model outperforms a baseline theoretical model widely used to compute dissipation rates from oceanographic observations of vertical shear, being more robust in capturing the tails of the output distributions at multiple different time points during turbulent decay. A differential sensitivity analysis indicates that this improvement may be attributed to the ability of the network to capture additional underlying physics introduced by density gradients in the flow.
Sam Lewin 🔗 


Robust and Provably Monotonic Networks
(Poster)
The Lipschitz constant of the map between the input and output space represented by a neural network is a natural metric for assessing the robustness of the model. We present a new method to constrain the Lipschitz constant of dense deep learning models that can also be generalized to other architectures. The method relies on a simple weight normalization scheme during training which ensures every layer is 1-Lipschitz. A simple residual connection can then be used to make the model monotonic in any subset of its inputs, which is useful in scenarios where domain knowledge dictates such dependence. Examples can be found in algorithmic fairness requirements or, as presented here, in the classification of particle decays. Our normalization is minimally constraining and allows the underlying architecture to maintain higher expressiveness compared to other techniques which aim to either control the Lipschitz constant of the model or ensure its monotonicity. We show how the algorithm was used to train a powerful, robust, and interpretable discriminator for heavy-flavor decays in the LHCb Run 3 trigger system.
Niklas S Nolte · Ouail Kitouni · Mike Williams 🔗 
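
The combination of a Lipschitz-bounded network and a residual connection described in the abstract above can be sketched as follows. This is a minimal pure-Python example under stated assumptions: each weight matrix is rescaled so that its maximum absolute row sum is at most 1 (one possible choice of norm, not necessarily the authors'), activations are ReLU, and the inputs listed in `monotone_idx` are added back to the output. All names are hypothetical.

```python
import random

def normalize(W):
    """Rescale W so its max absolute row sum (infinity operator norm) is at most 1."""
    norm = max(sum(abs(w) for w in row) for row in W)
    scale = 1.0 / max(1.0, norm)
    return [[w * scale for w in row] for row in W]

def forward(layers, x):
    """ReLU network with normalized weights: 1-Lipschitz in the max-norm."""
    h = list(x)
    for k, (W, b) in enumerate(layers):
        Wn = normalize(W)
        h = [sum(w * v for w, v in zip(row, h)) + bk for row, bk in zip(Wn, b)]
        if k < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU is itself 1-Lipschitz
    return h[0]

def monotonic_model(layers, x, monotone_idx):
    """Residual connection: adding x_i offsets any decrease of the bounded network."""
    return forward(layers, x) + sum(x[i] for i in monotone_idx)

rng = random.Random(0)

def random_layer(n_out, n_in):
    W = [[rng.uniform(-2, 2) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1, 1) for _ in range(n_out)]
    return W, b

# Untrained, randomly initialized 3 -> 8 -> 8 -> 1 network for illustration.
layers = [random_layer(8, 3), random_layer(8, 8), random_layer(1, 8)]
print(monotonic_model(layers, [0.5, -1.0, 2.0], monotone_idx=[0]))
```

Because the bounded network can change by at most the size of the input change, the added linear term guarantees that the output never decreases when a monotone input increases, whatever the weights are.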


Predicting flux in Discrete Fracture Networks via Graph Informed Neural Networks
(Poster)
Discrete Fracture Network (DFN) flow simulations are commonly used to determine the outflow in fractured media for critical applications. Here, we extend the formulation of spatial graph neural networks with a new architecture, called Graph Informed Neural Network (GINN), to speed up the Uncertainty Quantification analyses for DFNs. We show that the GINN model allows better Monte Carlo estimates of the mean and standard deviation of the outflow of a test case DFN. 
Stefano Berrone · Francesco Della Santa · Antonio Mastropietro · Sandra Pieraccini · Francesco Vaccarino 🔗 


Stronger symbolic summary statistics for the LHC
(Poster)
Analyzing the high-dimensional data collected at the Large Hadron Collider experiments often requires a balance between maximizing sensitivity and maintaining interpretability by domain experts. We propose a new algorithm to construct powerful summary statistics for LHC processes in the form of simple symbolic expressions. First, we extract latent information from a chain of simulators; through symbolic regression on this data we then learn approximately sufficient statistics. Observables constructed in this way can be used as plug-in replacements for established summary statistics, potentially improving the precision of scientific results without adding any overhead. In Higgs production in weak boson fusion, our algorithm rediscovers well-known heuristics and proposes new, moderately complex formulas that rival the new physics reach of neural networks.
Nathalie Soybelman · Anja Butter · Tilman Plehn · Johann Brehmer 🔗 


Photometric Redshifts for Cosmology: Improving Accuracy and Uncertainty Estimates Using Bayesian Neural Networks
(Poster)
We present results exploring the role that probabilistic deep learning models can play in cosmology from large-scale astronomical surveys through estimating the distances to galaxies (redshifts) from photometry. Due to the massive scale of data coming from these new and upcoming sky surveys, machine learning techniques using galaxy photometry are increasingly adopted to predict galactic redshifts, which are important for inferring cosmological parameters such as the nature of Dark Energy. Associated uncertainty estimates are also critical measurements; however, common machine learning methods typically provide only point estimates and lack uncertainty information as outputs. We turn to Bayesian Neural Networks (BNNs) as a promising way to provide accurate predictions of redshift values. We have compiled a new galaxy training dataset from the Hyper Suprime-Cam Survey, designed to mimic large surveys, but over a smaller portion of the sky. We evaluate the performance and accuracy of photometric redshift (photo-z) predictions from photometry using machine learning, astronomical and probabilistic metrics. We find that while the Bayesian Neural Network did not perform as well as non-Bayesian Neural Networks if evaluated solely by point estimate photo-z values, BNNs can provide uncertainty estimates that are necessary for cosmology.
Evan Jones 🔗 


Symmetries and self-supervision in particle physics
(Poster)
A long-standing problem in the design of machine-learning tools for particle physics applications has been how to incorporate prior knowledge of physical symmetries. In this note we propose contrastive self-supervision as a solution to this problem, with jet physics as an example. Using a permutation-invariant transformer network, we learn a representation which outperforms hand-crafted competitors on a linear classification benchmark.
Barry M Dillon · Tilman Plehn · Gregor Kasieczka 🔗 


Cooperative multi-agent reinforcement learning outperforms decentralized execution in high-dimensional nonequilibrium control for steady-state design
(Poster)
Experimental advances enabling high-resolution external control create new opportunities to produce materials with exotic properties. In this work, we investigate how a multi-agent reinforcement learning approach can be used to design external control protocols for self-assembly. We find that a fully decentralized approach performs remarkably well even with a "coarse" level of external control. More importantly, we see that a partially decentralized approach, where we include information about surrounding regions, allows us to better control our system towards some target distribution. We explain this by analyzing our approach as a partially observed Markov decision process. With a partially decentralized approach, the agent is able to act more presciently, both by preventing the formation of undesirable structures and by better stabilizing target structures, as compared to a fully decentralized approach.
Shriram Chennakesavalu · Grant Rotskoff 🔗 


Supplementing Recurrent Neural Network Wave Functions with Symmetry and Annealing to Improve Accuracy
(Poster)
Recurrent neural networks (RNNs) are a class of neural networks that have emerged from the paradigm of artificial intelligence and have enabled many interesting advances in the field of natural language processing. Interestingly, these architectures were shown to be powerful ansatze to approximate the ground state of quantum systems [1]. Here, we build on the results of Ref. [1] and construct a more powerful RNN wave function ansatz in two dimensions. We use symmetry and annealing to obtain accurate estimates of ground state energies of the two-dimensional (2D) Heisenberg model, on the square lattice and on the triangular lattice. We show that our method is superior to Density Matrix Renormalisation Group (DMRG) for system sizes larger than or equal to 12x12 on the triangular lattice. [1] M. Hibat-Allah, M. Ganahl, L. E. Hayward, R. G. Melko, and J. Carrasquilla, "Recurrent neural network wave functions," Physical Review Research, Jun 2020.
Mohamed Hibat Allah · Juan Carrasquilla · Roger Melko 🔗 


An Imperfect machine to search for New Physics: systematic uncertainties in a machine-learning based signal extraction
(Poster)
We show how to deal with uncertainties on the Reference Model predictions in a signal-model-independent new physics search strategy based on artificial neural networks. Our approach builds directly on the Maximum Likelihood ratio treatment of uncertainties as nuisance parameters for hypothesis testing that is routinely employed in high-energy physics. After presenting the conceptual foundations of the method, we show its applicability in a multivariate setup by studying the impact of two typical sources of experimental uncertainties in two-body final states at the LHC.
Gaia Grosso · Maurizio Pierini 🔗 


Optimizing High-Dimensional Physics Simulations via Composite Bayesian Optimization
(Poster)
Physical simulation-based optimization is a common task in science and engineering. Many such simulations produce image- or tensor-based outputs where the desired objective is a function of that image with respect to a high-dimensional parameter space. We develop a Bayesian optimization method leveraging tensor-based Gaussian process surrogates and trust region Bayesian optimization to effectively model the image outputs and to efficiently optimize these types of simulations, including an optical design problem and a radio-frequency tower configuration problem.
Wesley Maddox · Qing Feng · Max Balandat 🔗 


Simulation-Based Inference of Strong Gravitational Lensing Parameters
(Poster)
In the coming years, a new generation of sky surveys, in particular the Euclid Space Telescope (2022) and the Rubin Observatory’s Legacy Survey of Space and Time (LSST, 2023), will discover more than 200,000 new strong gravitational lenses, which represents an increase of more than two orders of magnitude compared to currently known sample sizes. Accurate and fast analysis of such large volumes of data under a robust statistical framework is therefore crucial for all sciences enabled by strong lensing. Here, we report on the application of simulation-based inference methods, in particular density estimation techniques, to the predictions of the set of parameters of strong lensing systems from neural networks. This allows us to explicitly impose desired priors on lensing parameters, while guaranteeing convergence to the optimal posterior in the limit of perfect performance.
Ronan Legin 🔗 


Out of equilibrium learning dynamics in physical allosteric resistor networks
(Poster)
Physical networks can learn desirable functions using local learning rules in space and time. Real learning systems, like natural neural networks, can learn out of equilibrium, on timescales comparable to their physical relaxation. Here we study coupled learning, a framework that supports learning in equilibrium in diverse physical systems. Relaxing the equilibrium assumption, we study experimentally and theoretically how physical resistor networks learn allosteric functions far from equilibrium. We show how fast learning produces oscillatory dynamics beyond a critical threshold, and that learning succeeds well beyond that threshold. These findings show how coupled learning rules may train systems much faster than assumed before, suggesting their applicability to slowly relaxing physical systems. 
Nachi Stern · Andrea Liu 🔗 


A New sPHENIX Heavy Quark Trigger Algorithm Based on Graph Neural Networks
(Poster)
Triggering plays a vital role in high energy nuclear and particle physics experiments. Here we propose a new trigger system design for heavy charm quark events in proton+proton (p+p) collisions in the sPHENIX experiment at the Relativistic Heavy Ion Collider (RHIC). This trigger system selects a charm event created in a p+p collision by identifying the topology of a charm-hadron (D^0) decay into a pair of oppositely charged kaon and pion particles. Classical approaches are based on statistical models, relying on complex hand-designed features, and are both cost-prohibitive and inflexible for discovering charm events from a large background of other collision events. The proposed neural network based trigger system takes into account unique high-level features of charm events, using a stack of images that are embedded in a deep neural network. By incorporating two state-of-the-art graph neural networks, ParticleNet and SAGPool, we can learn high-level physics features and perform binary classification with simple geometrical track information. Our model attains nearly 75% accuracy and only requires moderate resources. With a small number of neurons and simple input, our model is designed to be compatible with FPGAs and thereby enables extremely fast decision modules for real-time p+p collision events in the upcoming sPHENIX experiment at RHIC.
Yimin Zhu · Tingting Xuan 🔗 


Particle Graph Autoencoders and Differentiable, Learned Energy Mover's Distance
(Poster)
Autoencoders have useful applications in high energy physics in both compression and anomaly detection, particularly for jets: collimated showers of particles produced in collisions such as those at the CERN Large Hadron Collider. We explore the use of graph-based autoencoders, which operate on jets in their "particle cloud" representations and can leverage the interdependencies among the particles within jets, for such tasks. Additionally, we develop a differentiable approximation to the energy mover's distance via a graph neural network, which may subsequently be used as a reconstruction loss function for autoencoders.
Steven Tsan · Sukanya Krishna · Raghav Kansal · Anthony Aportela · Farouk Mokhtar · Daniel Diaz · Javier Duarte · Maurizio Pierini · Jean-Roch Vlimant 🔗 


Rethinking Neural Networks with Benford's Law
(Poster)
Benford's Law (BL) or the Significant Digit Law defines the probability distribution of the first digit of numerical values in a data sample. This Law is observed in many naturally occurring datasets. It can be seen as a measure of naturalness of a given distribution and finds its application in areas like anomaly and fraud detection. In this work, we address the following question: Is the distribution of the Neural Network parameters related to the network's generalization capability? To that end, we first define a metric, MLH (Model Enthalpy), that measures the closeness of a set of numbers to BL. Second, we use MLH as an alternative to Validation Accuracy for Early Stopping, removing the need for a Validation set. We provide experimental evidence that even if the optimal size of the validation set is known beforehand, the peak test accuracy attained is lower than that attained without using a validation set at all. Finally, we investigate the connection of BL to the Free-Energy Principle and the First Law of Thermodynamics, showing that MLH is a component of the internal energy of the learning system and framing optimization as an analogy to minimizing the total energy to attain equilibrium.
Surya Kant Sahu · Abhinav Java · Arshad Shaikh 🔗 
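
For reference, Benford's Law assigns the leading digit d the probability log10(1 + 1/d). The sketch below computes the first-digit distribution of a list of numbers and scores its closeness to BL with a Pearson correlation. The correlation-based score is an assumption chosen for illustration and should not be read as the abstract's definition of MLH; the function names are likewise hypothetical.

```python
import math
from collections import Counter

# Benford probabilities for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(value):
    """Leading significant digit of a non-zero finite number."""
    return int(f"{abs(value):e}"[0])

def first_digit_distribution(values):
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]

def pearson(p, q):
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    var_p = sum((a - mp) ** 2 for a in p)
    var_q = sum((b - mq) ** 2 for b in q)
    if var_p == 0 or var_q == 0:
        return 0.0  # correlation is undefined for a constant distribution
    return cov / math.sqrt(var_p * var_q)

def benford_closeness(values):
    """Illustrative closeness score in [-1, 1]; higher means closer to BL."""
    return pearson(first_digit_distribution(values), BENFORD)

powers_of_two = [2.0 ** k for k in range(1000)]   # known to follow BL closely
all_nines = [float(v) for v in range(900, 1000)]  # every leading digit is 9
print(benford_closeness(powers_of_two), benford_closeness(all_nines))
```

In the setting of the abstract, `values` would be the flattened parameters of a network at a given training step, so that the score can be tracked over training without a validation set.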


A Convolutional Autoencoder-Based Pipeline For Anomaly Detection And Classification Of Periodic Variables
(Poster)
The periodic pulsations of stars teach us about their underlying physical process. We present a convolutional autoencoder-based pipeline as an automatic approach to search for out-of-distribution anomalous periodic variables within the Zwicky Transient Facility (ZTF) catalog of periodic variables. We use an isolation forest to rank each periodic variable by the anomaly score. Our overall most anomalous events have a unique physical origin: they are mostly red, cool, high-variability, and irregularly oscillating periodic variables. Observational data suggest that they are most likely young and massive ($\simeq 5$-$10$ M$_\odot$) Red Giant or Asymptotic Giant Branch stars. Furthermore, we use the learned latent feature for the classification of periodic variables through a hierarchical random forest. This novel semi-supervised approach allows astronomers to identify the most anomalous events within a given physical class, significantly increasing the potential for scientific discovery.

Leon Chan · Siu Hei Cheung · Shirley Ho 🔗 


Partial-Attribution Instance Segmentation for Astronomical Source Detection and Deblending
(Poster)
Astronomical source deblending is the process of separating the contribution of individual stars or galaxies (sources) to an image comprised of multiple, possibly overlapping sources. Astronomical sources display a wide range of sizes and brightnesses and may show substantial overlap in images. Astronomical imaging data can further challenge off-the-shelf computer vision algorithms owing to its high dynamic range, low signal-to-noise ratio, and unconventional image format. These challenges make source deblending an open area of astronomical research, and in this work, we introduce a new approach called Partial-Attribution Instance Segmentation that enables source detection and deblending in a manner tractable for deep learning models. We provide a novel neural network implementation as a demonstration of the method.
Ryan Hausen 🔗 


Electromagnetic Counterpart Identification of Gravitational-wave candidates using deep-learning
(Poster)
Both time-domain and gravitational-wave (GW) astronomy have gone through a revolution in the last decade. These two previously disjoint fields converged when the electromagnetic (EM) counterpart of a binary neutron star merger, GW170817, was discovered in 2017. However, despite the discovery rate of GWs steadily increasing, severalfold in each observing run of the LIGO/Virgo GW instruments, GW170817 remains the only success story of EM-GW astronomy. While future GW detectors will detect an even larger number of events, this does not guarantee a corresponding increase in the number of EM counterparts discovered. In fact, the growing number is overwhelming, since wide-field telescope surveys will have to contend with distinguishing the optical EM counterpart, called a kilonova, from the ever-increasing number of "vanilla" transient objects they encounter during a GW follow-up operation. To this end, we present a novel tool based on a temporal convolutional network (TCN) architecture for Electromagnetic Counterpart Identification (El-CID). The overarching goal of El-CID is to slice through the list of objects that are consistent with the GW sky localization, and determine which sources are consistent with kilonovae, allowing limited and judicious use of telescope and spectroscopic resources. Our classifier is trained on sparse early-time photometry and contextual information available during discovery. Apart from verifying our model on an extensive testing sample, we also show successful results on real events during the previous LIGO/Virgo observing runs.
Deep Chatterjee 🔗 


Cross-Modal Virtual Sensing for Combustion Instability Monitoring
(Poster)
In many cyber-physical systems, imaging can be an important but expensive or 'difficult to deploy' sensing modality. One such example is detecting combustion instability using flame images, where deep learning frameworks have demonstrated state-of-the-art performance. The proposed frameworks are also shown to be quite trustworthy such that domain experts can have sufficient confidence to use these models in real systems to prevent unwanted incidents. However, flame imaging is not a common sensing modality in engine combustors today. Therefore, the current roadblock exists on the hardware side regarding the acquisition and processing of high-volume flame images. On the other hand, the acoustic pressure time series is a more feasible modality for data collection in real combustors. To utilize acoustic time series as a sensing modality, we propose a novel cross-modal encoder-decoder architecture that can reconstruct cross-modal visual features from acoustic pressure time series in combustion systems. With the "distillation" of cross-modal features, the results demonstrate that the detection accuracy can be enhanced using the virtual visual sensing modality. By providing the benefit of cross-modal reconstruction, our framework can prove to be useful in different domains well beyond the power generation and transportation industries.
Tryambak Gangopadhyay · Vikram Ramanan · Chakravarthy S.R. · Soumik Sarkar 🔗 


PlasmaNet: a framework to study and solve elliptic differential equations using neural networks in plasma fluid simulations
(Poster)
Elliptic partial differential equations (PDEs) are common in many areas of physics, from the Poisson equation in plasmas and incompressible flows to the Helmholtz equation in electromagnetism. Their numerical solution requires solving linear systems, which can become a bottleneck in terms of performance. The rise of computational power and the inherent speed of GPUs offer exciting opportunities to solve PDEs by recasting them in terms of optimization problems. In plasma fluid simulations, the Poisson equation is solved, coupled to the charged species transport equations. We introduce PlasmaNet (https://gitlab.com/cerfacs/plasmanet), an open-source library written to study neural networks in plasma simulations. Previous work using PlasmaNet has shown significant speedup using neural networks to solve the Poisson equation compared to classical linear system solvers on this problem. Results also showed that coupling the neural network Poisson solver to plasma transport equations is a viable option in terms of accuracy. In this work, we attempt to solve a new class of elliptic differential equations, the screened Poisson equations, using neural networks. These equations are used to infer the photoionization source term from the ionization rate in streamer discharges. The same methodology as that adopted for the Poisson equation is followed. A simulation running with three neural networks, one to solve the Poisson equation and two to solve the photoionization equations, yields accurate results, extending the range of applicability of the method developed previously.
Lionel Cheng · Michaël Bauerheim 🔗 


Equivariant and Modular DeepSets with Applications in Cluster Cosmology
(Poster)
We design modular and rotationally equivariant DeepSets for predicting a continuous background quantity from a set of known foreground particles. Using this architecture, we address a crucial problem in Cosmology: modelling the continuous electron pressure field inside massive structures known as “clusters.” Given a simulation of pressureless, dark matter particles, our networks can directly and accurately predict the background electron pressure field. The modular design of our architecture makes it possible to physically interpret the individual components. Our most powerful deterministic model improves by 70% on the benchmark. A conditional-VAE extension yields a further improvement of 7%, although it is limited by our small training set. We envision use cases beyond theoretical cosmology, for example in soft condensed matter physics, or meteorology and climate science.
Leander Thiele · Miles Cranmer · Shirley Ho · David Spergel 🔗 


A debiasing framework for deep learning applied to the morphological classification of galaxies
(Poster)
The morphologies of galaxies and their relation with physical features have been extensively studied in the past. Galaxy morphology labels are usually created by humans and are used to train machine learning models. Human labels have been shown to contain biases in terms of observational parameters such as the resolution of the labeled images. In this work, we demonstrate that deep learning models trained on biased galaxy data produce biased predictions. We also propose a method to train neural networks that takes into account this inherent labeling bias. We show that our deep debiasing method is able to reduce the bias of the models even when trained using biased data.
Esteban Medina · Guillermo CabreraVives 🔗 


The Quantum Trellis: A classical algorithm for sampling the parton shower with interference effects
(Poster)
Simulations of high-energy particle collisions, such as those used at the Large Hadron Collider, are based on quantum field theory; however, many approximations are made in practice. For example, the simulation of the parton shower, which gives rise to objects called "jets", is based on a semi-classical approximation that neglects various interference effects. While there is a desire to incorporate interference effects, new computational techniques are needed to cope with the exponential growth in complexity associated with quantum processes. We present a classical algorithm called the quantum trellis to efficiently compute the unnormalized probability density over N-body phase space including all interference effects, and we pair this with an MCMC-based sampling strategy. This provides a potential path forward for classical computers and a strong baseline for approaches based on quantum computing.
Sebastian Macaluso · Kyle Cranmer 🔗 


Variational framework for partially-measured physical system control
(Poster)
For a physical system to behave as desired, either its underlying governing rules must be known a priori or the system itself must be accurately measured. The complexity of full measurements of the system scales with its size. When exposed to real-world conditions, such as perturbations or time-varying settings, a system calibrated for a fixed working condition might require non-trivial re-calibration, a process that could be prohibitively expensive, inefficient and impractical for real-world use cases. In this work, we propose a learning procedure to obtain a desired target output from a physical system. We use Variational Auto-Encoders (VAE) to provide a generative model of the system function and use this model to obtain the required input of the system that produces the target output. We showcase the applicability of our method for two datasets in optical physics and neuroscience.
Babak Rahmani · Demetri Psaltis 🔗 


An ML Framework for Estimating Bayesian Posteriors of Galaxy Morphological Parameters
(Poster)
Galaxy morphology is connected to various fundamental properties of a galaxy and studying the morphology of large samples of galaxies is central to understanding the relationship between morphology and the physics of galaxy formation and evolution. For the first time, we are able to use machine learning to estimate Bayesian posteriors for galaxy morphological parameters. To achieve this, GAMPEN, our machine learning framework, uses a spatial transformer network (STN), a convolutional neural network, and the Monte Carlo Dropout technique. This novel application of an STN in astronomy also enables GAMPEN to crop out most secondary galaxies in the frame and focus on the galaxy of interest. We also demonstrate that by first training on simulations and then performing transfer learning using real data, we are able to achieve excellent estimates for morphological parameters of galaxies in the Hyper Suprime-Cam Wide survey, while using only a small amount of real training data.
Aritra Ghosh 🔗 


Analyzing High-Resolution Clouds and Convection using Multi-Channel VAEs
(Poster)
Understanding the details of small-scale convection and storm formation is crucial to accurately represent the larger-scale planetary dynamics. Presently, atmospheric scientists run high-resolution, storm-resolving simulations to capture these kilometer-scale weather details. However, because they contain abundant information, these simulations can be overwhelming to analyze using conventional approaches. This paper takes a data-driven approach and jointly embeds spatial arrays of vertical wind velocities, temperatures, and water vapor information as three "channels" of a VAE architecture. Our "multi-channel VAE" results in more interpretable and robust latent structures than earlier work analyzing vertical velocities in isolation. Analyzing and clustering the VAE's latent space identifies weather patterns and their geographical manifestations in a fully unsupervised fashion. Our approach shows that VAEs can play essential roles in analyzing high-dimensional simulation data and extracting critical weather and climate characteristics.
Harshini Mangipudi · Griffin Mooers · Mike Pritchard · Tom Beucler · Stephan Mandt 🔗 


Detecting Low Surface Brightness Galaxies with Mask R-CNN
(Poster)
Low surface brightness galaxies (LSBGs), galaxies that are fainter than the dark night sky, are famously difficult to detect. However, studies of these galaxies are essential to improve our understanding of the formation and evolution of low-mass galaxies. In this work, we train a deep learning model using the Mask R-CNN framework on a set of simulated LSBGs inserted into images from the Dark Energy Survey (DES) Data Release 2 (DR2). This deep learning model is combined with several conventional image pre-processing steps to develop a pipeline for the detection of LSBGs. We apply this pipeline to the full DES DR2 coadd image dataset, and preliminary results show the detection of 22 large, high-quality LSBG candidates that went undetected by conventional algorithms. Furthermore, we find that the performance of our algorithm is greatly improved by including examples of false positives as an additional class during training.
Caleb Levy 🔗 


Online Bayesian Optimization for Beam Alignment in the SECAR Recoil Mass Separator
(Poster)
The SEparator for CApture Reactions (SECAR) is a next-generation recoil separator system at the Facility for Rare Isotope Beams (FRIB) designed for the direct measurement of capture reactions on unstable nuclei in inverse kinematics. To maximize the performance of the device, careful beam alignment to the central ion optical axis needs to be achieved. This can be difficult to attain through manual tuning by human operators without potentially leaving the system in a sub-optimal and irreproducible state. In this work, we present the first development of online Bayesian optimization with a Gaussian process model to tune an ion beam through a nuclear astrophysics recoil separator. We show that the method achieves small incoming angular deviations (0-1 mrad) in an efficient and reproducible manner that is at least 3 times faster than standard hand-tuning. This method is now routinely used for all separator tuning.
Sara Miskovich 🔗 


Implicit Quantile Neural Networks for Jet Simulation and Correction
(Poster)
Reliable modeling of conditional densities is important for quantitative scientific fields such as particle physics. In domains outside physics, implicit quantile neural networks (IQNs) have been shown to provide accurate models of conditional densities. We present a successful application of IQNs to jet simulation and correction using the tools and simulated data from the Compact Muon Solenoid (CMS) Open Data portal.
Michelle Kuchera · Raghuram Ramanujan 🔗 


Real-time Detection of Anomalies in Multivariate Time Series of Astronomical Data
(Poster)
Astronomical transients are stellar objects that become temporarily brighter on various timescales and have led to some of the most significant discoveries in cosmology and astronomy. Some of these transients are the explosive deaths of stars known as supernovae while others are rare, exotic, or entirely new kinds of exciting stellar explosions. New astronomical sky surveys are observing unprecedented numbers of multi-wavelength transients, making standard approaches of visually identifying new and interesting transients infeasible. To meet this demand, we present two novel methods of quickly and automatically detecting anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We show that the flexibility of neural networks, the attribute that makes them such a powerful regressor for many tasks, is what makes them less suitable for anomaly detection when compared with our parametric model.
Daniel Muthukrishna 🔗 
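The idea shared by both methods above, scoring a light curve by how far it deviates from a model's predictions, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `anomaly_score`, the reduced chi-squared form of the score, and the toy light curves are all assumptions made for this example.

```python
import numpy as np

def anomaly_score(flux, flux_err, predicted_flux, predicted_sigma=None):
    """Mean squared deviation from the model, in units of the total variance.

    Hypothetical helper: larger values mean the observed light curve is
    less consistent with the model of the known transient population.
    """
    flux = np.asarray(flux, dtype=float)
    predicted_flux = np.asarray(predicted_flux, dtype=float)
    variance = np.asarray(flux_err, dtype=float) ** 2
    if predicted_sigma is not None:
        # Include the model's own predictive uncertainty, if it provides one.
        variance = variance + np.asarray(predicted_sigma, dtype=float) ** 2
    return float(np.mean((flux - predicted_flux) ** 2 / variance))

# Toy data (assumed): a smooth rise-and-fall model and two observed curves.
times = np.linspace(0.0, 50.0, 26)
model = np.exp(-0.5 * ((times - 20.0) / 8.0) ** 2)
err = np.full_like(times, 0.05)

typical = model + 0.02 * np.sin(times)   # small wiggles around the model
unusual = model.copy()
unusual[18:] += 0.5                      # late-time re-brightening

score_typical = anomaly_score(typical, err, model)
score_unusual = anomaly_score(unusual, err, model)
print(score_typical, score_unusual)
```

In a real-time setting the same score could be updated as each new observation arrives, with a threshold chosen from the score distribution of known transients.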


Neural network is heterogeneous: Phase matters more
(Poster)
Drawing on insight from wave optics, we find a heterogeneity in both complex- and real-valued neural networks: the phase of the weight matrix plays a much more important role than its amplitude. In complex-valued neural networks, we show that among different types of pruning, the weight matrix with only phase information preserved achieves the best accuracy, which holds robustly under various depths and widths. The conclusion can be generalized to real-valued neural networks, where signs take the place of phases. These findings enrich the techniques of network pruning and binary computation.
Yuqi Nie 🔗 
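To make the phase-only (or sign-only) pruning described above concrete, here is a minimal sketch. It is not the authors' code: the helper name `keep_phase_only` and the choice to rescale by the mean magnitude are assumptions for illustration.

```python
import numpy as np

def keep_phase_only(weights):
    """Discard per-entry amplitude, keeping only phase (complex) or sign (real).

    Hypothetical helper: every non-zero entry is replaced by a value with the
    same phase or sign and a shared magnitude (the mean magnitude, by assumption).
    """
    w = np.asarray(weights)
    magnitude = np.abs(w)
    scale = magnitude.mean()
    # Avoid division by zero; exact zeros stay zero.
    safe = np.where(magnitude > 0, magnitude, 1.0)
    return scale * w / safe

rng = np.random.default_rng(0)
w_complex = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
w_phase = keep_phase_only(w_complex)     # unit phasors times a shared scale

w_real = np.array([[0.3, -1.2], [2.0, -0.1]])
w_sign = keep_phase_only(w_real)         # reduces to scale * sign(w)
print(w_sign)
```

For real-valued weights the result is a binary matrix up to a single scale factor, which is the link to binary computation mentioned in the abstract.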


Simultaneous Multivariate Forecast of Space Weather Indices using Deep Neural Network Ensembles
(Poster)
Solar radio flux along with geomagnetic indices are important indicators of solar activity and its effects. Extreme solar events such as flares and geomagnetic storms can negatively affect the space environment including satellites in low-Earth orbit. Therefore, forecasting these space weather indices is of great importance in space operations and science. In this study, we propose a model based on long short-term memory neural networks to learn the distribution of time series data with the capability to provide a simultaneous multivariate 27-day forecast of the space weather indices using time series as well as solar image data. We show a 30–40% improvement of the root mean-square error when including solar image data with time series data compared to using time series data alone. Simple baselines such as persistence and running-average forecasts are also compared with the trained deep neural network models. We also quantify the uncertainty in our prediction using a model ensemble.
Bernard Benson · Christopher Bridges · Atilim Gunes Baydin 🔗 
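The two simple baselines named above, persistence and running average, can be sketched as below for a multivariate series. This is an illustrative sketch only: the function names, the window length, and the toy numbers are assumptions and do not come from the paper.

```python
import numpy as np

def persistence_forecast(history, horizon):
    """Repeat the last observed value of every index for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    return np.tile(history[-1], (horizon, 1))

def running_average_forecast(history, horizon, window):
    """Repeat the mean of the last `window` observations for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    return np.tile(history[-window:].mean(axis=0), (horizon, 1))

def rmse(forecast, truth):
    """Root mean-square error per index (column)."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(truth, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))

# Toy history (assumed values): rows are days, columns are two indices.
history = np.array([[70.0, 5.0], [72.0, 7.0], [75.0, 6.0],
                    [74.0, 9.0], [78.0, 8.0], [80.0, 10.0]])

persistence = persistence_forecast(history, horizon=27)
average = running_average_forecast(history, horizon=27, window=3)
print(persistence[0], average[0])
```

Any learned model would be expected to beat both baselines on RMSE over the same 27-day horizon before its added complexity is justified.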


Symmetry Discovery with Deep Learning
(Poster)
Symmetries are a fundamental property of functions associated with data. A key function for any dataset is its probability density, and the symmetries thereof are referred to as the symmetries of the dataset itself. We provide a rigorous statistical notion of symmetry for a dataset, which involves reference datasets that we call "inertial" in analogy to inertial frames in classical mechanics. Then, we construct a novel approach to automatically discover symmetries from a dataset using a deep learning method based on an adversarial neural network. We test our method on the LHC Olympics dataset. Symmetry discovery may lead to new insights and can reduce the effective dimensionality of a dataset to increase its effective statistics.
Krish Desai · Benjamin Nachman · Jesse Thaler 🔗 


Approximate Bayesian Computation for Physical Inverse Modeling
(Poster)
Semiconductor device models are essential to understand the charge transport in thin film transistors (TFTs). Using these TFT models to draw inferences involves estimating the parameters used to fit the experimental data. These experimental data can involve extracted charge carrier mobility or measured current. Estimating these parameters helps us draw inferences about device performance. Fitting a TFT model to given experimental data relies on manual fine-tuning of multiple parameters by human experts. Several of these parameters may have confounding effects on the experimental data, making the extraction of their individual effects a nonintuitive process during manual tuning. To avoid this convoluted process, we propose a new method for automating the model parameter extraction process, resulting in an accurate model fit. In this work, model-choice-based approximate Bayesian computation (ABC) is used for generating the posterior distribution of the estimated parameters using observed mobility at various gate voltage values. Furthermore, it is shown that the extracted parameters can be accurately predicted from the mobility curves using gradient boosted trees. This work also provides a comparative analysis of the proposed framework with fine-tuned neural networks, wherein the proposed framework is shown to perform better.
Neel Chatterjee · Somya Sharma · Ansu Chatterjee 🔗 


Noether Networks: Meta-Learning Useful Conserved Quantities
(Poster)
Progress in machine learning (ML) relies on an appropriate encoding of inductive biases. Useful biases often exploit symmetries in the prediction problem, such as convolutional networks relying on translation equivariance. Automatically discovering these useful symmetries holds the potential to greatly improve the performance of ML systems, but still remains a challenge. In this work, we focus on sequential prediction problems and take inspiration from Noether's theorem to reduce the problem of finding inductive biases to meta-learning useful conserved quantities. We propose Noether Networks: a new type of architecture where a meta-learned conservation loss is optimized inside the prediction function. We show, theoretically and experimentally, that Noether Networks improve prediction quality, providing a framework for discovering inductive biases in sequential problems.
Ferran Alet · Dylan Doblar · Allan Zhou · Josh Tenenbaum · Kenji Kawaguchi · Chelsea Finn 🔗 


Explaining machine-learned particle-flow reconstruction
(Poster)
The particle-flow (PF) algorithm is used in general-purpose particle detectors to reconstruct a comprehensive particle-level view of the collision by combining information from different subdetectors. A graph neural network model, known as the MLPF algorithm, has been developed to substitute rule-based PF. However, understanding the model's decision making is not straightforward, especially given the complexity of the set-to-set prediction task, dynamic graph building, and message-passing steps. In this paper, we adapt the layerwise-relevance propagation technique to the MLPF algorithm to gauge the relevant nodes and features for its predictions. Through this we gain insight into the model's decision-making.
Farouk Mokhtar · Raghav Kansal · Daniel Diaz · Javier Duarte · Maurizio Pierini · Jean-Roch Vlimant 🔗 


Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations
(Poster)
Physics-informed neural networks allow models to be trained by physical laws described by general nonlinear partial differential equations. However, traditional architectures struggle to solve more challenging time-dependent problems. In this work, we present a novel physics-informed framework for solving time-dependent partial differential equations. Our proposed model utilizes discrete cosine transforms to encode spatial frequencies and recurrent neural networks to process the time evolution, achieving state-of-the-art performance on the Taylor-Green vortex relative to other physics-informed baseline models.
Benjamin Wu · Oliver Hennigh · Jan Kautz · Sanjay Choudhry · Wonmin Byeon 🔗 


A Granular Method for Finding Anomalous Light Curves and their Analogs
(Poster)
Anomalous light curves indicate rare and as yet unexplainable phenomena associated with astronomical sources. With existing large surveys like the Zwicky Transient Facility (ZTF), and upcoming ones such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) that will observe astrophysical transients at all time scales and produce archival light curves in the billions, there is an immediate need for methods that reveal anomalous light curves. Previous work explores anomalous light curve detection, but little work has gone into finding analogs of such light curves. That is, given a light curve of interest, can we find other examples in the dataset that behave similarly? We present a pipeline that (1) identifies anomalous light curves, and (2) finds additional examples of specific rare classes in a large corpus of light curves. We apply this method to Kepler data, finding around 5000 previously unknown anomalies, and present a subset of these anomalies along with their potential astrophysical classification.
Kushal Tirumala 🔗 


Galaxy Morphological Classification with Efficient Vision Transformer
(Poster)
Quantifying the morphology of galaxies has been an important task in astrophysics to understand the formation and evolution of galaxies. In recent years, the data size has been dramatically increasing due to several ongoing and upcoming surveys. Labeling and identifying interesting objects for further investigation has been explored by citizen science through the Galaxy Zoo Project and by machine learning, in particular with convolutional neural networks (CNNs). In this work, we explore the usage of Vision Transformer (ViT) for galaxy morphology classification for the first time. We show that ViT can reach results competitive with CNNs, and is particularly good at classifying smaller-sized and fainter galaxies. With this promising preliminary result, we believe the ViT network architecture can be an important tool for galaxy morphological classification for the next-generation surveys. We plan to open source our repository in the near future.
Joshua Yao-Yu Lin 🔗 


End-To-End Online sPHENIX Trigger Detection Pipeline
(Poster)
This paper provides a comprehensive end-to-end pipeline to classify trigger versus background events, make online decisions to filter signal data, and enable an intelligent trigger system for efficient data collection in the sPHENIX Data Acquisition System (DAQ). The pipeline starts with the coordinates of pixel hits that are lit up by passing particles in the detector, applies three stages of event processing (hit clustering, track reconstruction, and trigger detection), and finally labels each processed event with a binary tag of trigger vs. background. The whole pipeline consists of deterministic algorithms such as clustering pixels to reduce event size, track reconstruction to predict candidate edges, and advanced graph neural network-based models for recognizing the entire jet pattern. In particular, we apply a Message Passing Graph Neural Network to predict links between hits and reconstruct tracks, and a hierarchical pooling algorithm (DiffPool) to make the graph-level trigger detection. We attain greater than 70% accuracy for trigger detection with only 3200 neuron weights in the end-to-end pipeline.
Tingting Xuan · Yimin Zhu 🔗 


Accelerator Tuning with Deep Reinforcement Learning
(Poster)
Particle accelerators require routine tuning during operation and when new isotope species are introduced. This is a complex process requiring many hours from experienced operators. The control aspect of this problem is challenging for traditional approaches, but it is a promising candidate for reinforcement learning. We aim to develop an automated tuning procedure for the accelerators at TRIUMF, starting with the Off-Line Ion Source (OLIS) portion of the Isotope Separator and Accelerator (ISAC) facility. In this early stage of research, we show that the method of Recurrent Deep Deterministic Policy Gradients (RDPG) is successful in learning accelerator tuning procedures for a simple simulated environment representing the OLIS section.
David Wang 🔗 


Geometric Priors for Scientific Generative Models in Inertial Confinement Fusion
(Poster)
In this paper, we develop a Wasserstein autoencoder (WAE) with a hyperspherical prior for multimodal data in the application of inertial confinement fusion. Unlike a typical hyperspherical generative model that requires computationally inefficient sampling from distributions like the von Mises-Fisher, we sample from a normal distribution followed by a projection layer before the generator. Finally, to determine the validity of the generated samples, we exploit a known relationship between the modalities in the dataset as a scientific constraint, and study different properties of the proposed model.
Ankita Shukla · Rushil Anirudh · Eugene Kur · Jayaraman Thiagarajan · Timo Bremer · Brian K Spears · Tammy Ma · Pavan Turaga 🔗 
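The sampling shortcut described above, drawing from a normal distribution and then projecting, can be sketched in a few lines. This is a minimal sketch under assumptions: the function name and latent dimension are hypothetical, and the projection here is a plain L2 normalisation, which may differ from the projection layer used in the paper.

```python
import numpy as np

def sample_hyperspherical_prior(n_samples, latent_dim, rng):
    """Draw latent vectors on the unit hypersphere.

    Hypothetical helper: samples an isotropic Gaussian and projects each
    vector onto the unit sphere by dividing by its L2 norm. Normalised
    isotropic Gaussian vectors are uniformly distributed on the sphere.
    """
    z = rng.standard_normal((n_samples, latent_dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

rng = np.random.default_rng(0)
latents = sample_hyperspherical_prior(1000, 8, rng)
print(latents.shape, np.linalg.norm(latents[0]))
```

The appeal of this route is that it needs only a Gaussian sampler and one normalisation, with no rejection sampling step.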


Neural Tensor Contractions and the Expressive Power of Deep Neural Quantum States
(Poster)
We establish a direct connection between general tensor networks and deep feed-forward artificial neural networks. The core of our results is the construction of neural-network layers that efficiently perform tensor contractions, and that use commonly adopted non-linear activation functions. The resulting deep networks feature a number of edges that closely matches the contraction complexity of the tensor networks to be approximated. In the context of many-body quantum states, this result establishes that neural-network states have strictly the same or higher expressive power than practically usable variational tensor networks. As an example, we show that all matrix product states can be efficiently written as neural-network states with a number of edges polynomial in the bond dimension and depth logarithmic in the system size. The opposite instead does not hold true, and our results imply that there exist quantum states that are not efficiently expressible in terms of matrix product states or practically usable PEPS, but that are instead efficiently expressible with neural network states.
Or Sharir · Amnon Shashua · Giuseppe Carleo 🔗 


Fast synthesis and inversion of spectral lines in stellar chromospheres with graph networks
(Poster)
The physical properties of the outer layers of stellar atmospheres (temperature, velocity and/or magnetic field) can be inferred by inverting the radiative transfer forward problem. The main obstacle is that the model required to synthesize the strong lines that sample the stellar chromospheres is extremely time consuming, which makes the solution of the inverse problem not very practical. Here we leverage graph networks to predict the population number density of the atom energy levels simply from the temperature and optical depth stratification. We demonstrate that a speedup of a factor $10^3$ can be obtained with a negligible impact on precision. This opens up the possibility of large-scale synthesis in three-dimensional models and routine inversion of observations to infer the 3D properties of the solar and stellar chromospheres.

Andreu Vicente Arevalo 🔗 


Embedding temporal error propagation on CNN for unsteady flow simulations
(Poster)
This work investigates the interaction between a fluid solver and a CNN-based Poisson solver for unsteady incompressible flow simulations. During training, the network prediction is used to advance the computation in time, embedding the influence of the network prediction on the simulation using a long-term loss. This study investigates three implementations of such a loss, as well as the number of look-ahead iterations. On all test cases, results show that long-term losses are beneficial. Interestingly, a partial implementation without a differentiable solver is found to be accurate, robust, and less costly than the full implementation.
Ekhi Ajuria Illarramendi · Michaël Bauerheim 🔗 


Phenomenological classification of the Zwicky Transient Facility astronomical event alerts
(Poster)
The Zwicky Transient Facility (ZTF), a state-of-the-art optical robotic sky survey, registers on the order of a million transient events (such as supernova explosions, changes in brightness of variable sources, or moving object detections) every clear night, generating real-time alerts. We present Alert-Classifying Artificial Intelligence (ACAI), an open-source deep-learning framework for the phenomenological classification of the ZTF alerts. ACAI uses a set of five binary classifiers to characterize objects, which in combination with the auxiliary/contextual event information available from alert brokers, provides a powerful tool for alert stream filtering tailored to different science cases, including early identification of supernova-like and anomalous transient events. We report on the performance of ACAI during the first months of deployment in a production setting.
Dmitry Duev 🔗 


An Emulation Framework for Fire Front Spread
(Poster)
Forecasting bushfire spread is an important element in fire prevention and response efforts. Empirical observations of bushfire spread can be used to estimate fire response under certain conditions. These observations form rate-of-spread models, which can be used to generate simulations. We use machine learning to drive the emulation approach for bushfires and show that emulation has the capacity to closely reproduce simulated fire-front data. We present a preliminary emulator approach with the capacity for fast emulation of complex simulations. Large numbers of predictions can then be generated as part of ensemble estimation techniques, which provide more robust and reliable forecasts of stochastic systems.
Andrew Bolt · Petra Kuhnert · Joel Dabrowski 🔗 


Score-based Graph Generative Model for Neutrino Events Classification and Reconstruction
(Poster)
The IceCube Neutrino Observatory is an astroparticle physics experiment to investigate neutrinos from the universe. Our task is to classify neutrino events and reconstruct events of interest. Graph neural networks (GNNs) have achieved great success in this area due to their powerful modeling ability for the irregular grid structure of the detectors. Unlike existing GNN-based methods, which neglect the quality of the constructed graph for the GNN to operate on, we focus on the graph construction step via the score-based generative model to enhance the performance of downstream tasks. Extensive experiments verify the efficacy of our method.
Yiming Sun · Zixing Song · Irwin King 🔗 


E(2) Equivariant Self-Attention for Radio Astronomy
(Poster)
In this work we introduce group-equivariant self-attention models to address the problem of explainable radio galaxy classification in astronomy. We evaluate various orders of both cyclic and dihedral equivariance, and show that including equivariance as a prior both reduces the number of epochs required to fit the data and results in improved performance. We highlight the benefits of equivariance when using self-attention as an explainable model and illustrate how equivariant models statistically attend to the same features in their classifications as human astronomers.
Micah Bowles 🔗 


Unbiased Monte Carlo Cluster Updates with Autoregressive Neural Networks
(Poster)
Efficient sampling of complex high-dimensional probability densities is a central task in computational science. Machine learning techniques based on autoregressive neural networks provide good approximations to probability distributions of interest in physics. In this work, we propose a systematic way to make this approximation unbiased by using it as an automatic generator of Markov chain Monte Carlo cluster updates. Symmetry-enforcing and variable-size cluster updates are found to be essential to the success of this technique. We test our method for first- and second-order phase transitions of classical spin systems, showing its viability for critical systems and in the presence of metastable states.
Dian Wu 🔗 
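A standard way to remove the bias of an approximate sampler, and a plausible reading of the approach above, is a Metropolis-Hastings accept/reject step. The expression below is a sketch under assumptions, not the authors' formula: $p$ denotes the target (for example Boltzmann) distribution, $q$ the distribution defined by the autoregressive network, and the proposal is assumed to be drawn independently of the current configuration $x$.

```latex
% Metropolis-Hastings acceptance probability for a proposal x' ~ q
% (assumed notation: p = target distribution, q = network distribution)
A(x \to x') = \min\left(1,\ \frac{p(x')\, q(x)}{p(x)\, q(x')}\right)
```

Because $p$ enters only through the ratio $p(x')/p(x)$, its normalisation constant is not needed. For a cluster update that resamples only a subset of the variables, $q$ would be replaced by the network's conditional distribution over that subset given the variables held fixed.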


Using Deep Learning for estimation of river surface elevation from photogrammetric Digital Surface Models
(Poster)
Developing new methods of surface water observation is crucial in view of increasingly frequent extreme hydrological events related to global warming and the increasing demand for water. Orthophotos and digital surface models (DSMs) obtained using UAV photogrammetry can be used to determine the water surface height of a river. However, this task is difficult due to disturbances of the water surface on DSMs caused by limitations of photogrammetric algorithms. In this study, machine learning models were used to extract a single water surface elevation value. A brand new dataset has been prepared specifically for this purpose by hydrology and photogrammetry experts. The new method is an important step toward automating water surface level measurements with high spatial and temporal resolution. Such data can be used to validate and calibrate hydrological, hydraulic and hydrodynamic models, making hydrological forecasts more accurate, in particular when predicting extreme and dangerous events such as floods or droughts. To our knowledge, this is the first approach in which a dataset was created for this purpose and deep learning models were used for this task. The obtained results are more accurate than manual methods of determining water surface elevation (WSE) from photogrammetric DSMs. Additionally, a neuroevolution algorithm was employed to explore different architectures and find optimal models.
Marcin Pietroń 🔗 


Crystal graph convolutional neural networks for per-site property prediction
(Poster)
Graph convolutional neural networks (GCNNs) have been shown to accurately predict materials properties by featurizing local atomic environments. However, such models have not yet been utilized for predicting per-site features such as Bader charge, magnetic moment, or site-projected band centers. In this work, we develop a per-site crystal graph convolutional neural network that predicts a wide array of per-site properties. This model outperforms a per-element average baseline, and is thus capturing the effect of the neighborhood around each atom. Using magnetic moments as a case study, we explore an example of underlying physics the per-site model is able to learn.
Jessica Karaguesian · Jaclyn Lunger · Rafa Gomez-Bombarelli 🔗 


Differentiable Strong Lensing for Complex Lens Modelling
(Poster)
Strong lensing is a stunning physics phenomenon through which the light emitted from a distant cosmological source is distorted by the gravitational field of a foreground object distributed along the line of sight. Strong lensing observations are important, since, from their analysis, it is possible to infer properties of both the light-emitting source and the lens. In particular, precise lens modelling allows for the extraction of precious information on the distribution of dark matter in galaxies and clusters, which can provide tight constraints on several cosmological parameters. In this work, we consider the case where a comprehensive closed-form parametric model of the lens potential is only partially available, and we propose to model missing mass along the line of sight with a deep neural network. We incorporate the network within a fully differentiable, physically sound strong lensing simulator, and we train it via maximum likelihood estimation in an end-to-end fashion. Our experiments show that the model is able to effectively interact with the other components of the simulator and can successfully retrieve the underlying potential without any assumption on its form.
Luca Biggio 🔗 


Probing the Structure of String Theory Vacua with Genetic Algorithms and Reinforcement Learning
(Poster)
Identifying string theory vacua with desired physical properties at low energies requires searching through high-dimensional solution spaces, collectively referred to as the string landscape. We highlight that this search problem is amenable to reinforcement learning and genetic algorithms. In the context of flux vacua, we are able to reveal novel features (suggesting previously unidentified symmetries) in the string theory solutions required for properties such as the string coupling. In order to identify these features robustly, we combine results from both search algorithms, which we argue is imperative for reducing sampling bias.
Andreas Schachner · Sven Krippendorf · Alex Cole · Gary Shiu 🔗 


DeepSWIM: A few-shot learning approach to classify Solar WInd Magnetic field structures
(Poster)
The solar wind consists of charged particles ejected from the Sun into interplanetary space and towards Earth. Understanding the magnetic field of the solar wind is crucial for predicting future space weather and planetary atmospheric loss. A lack of labeled data makes automated detection of discontinuities in this magnetic field challenging. We propose DeepSWIM, an approach leveraging advances in contrastive learning, pseudo-labeling and online hard example mining to robustly identify discontinuities in solar wind magnetic field data. Through a systematic ablation study, we show that we can accurately classify discontinuities despite learning from only limited labeled data. Additionally, we show that our approach generalizes well and produces results that agree with expert hand-labeling.
Sudeshna Boro Saikia · Hala Lamdouar · Sairam Sundaresan · Anna Jungbluth · Marcella Scoczynski Ribeiro Martins · Anthony Sarah · Andres Munoz-Jaramillo 🔗 


Factorized Fourier Neural Operators
(Poster)
The Fourier Neural Operator (FNO) is a learning-based method for efficiently simulating partial differential equations. We propose the Factorized Fourier Neural Operator (F-FNO) that allows much better generalization with deeper networks. With a careful combination of the Fourier factorization, weight sharing, the Markov property, and residual connections, F-FNOs achieve a six-fold reduction in error on the most turbulent setting of the Navier-Stokes benchmark dataset. We show that our model maintains an error rate of 2% while still running an order of magnitude faster than a numerical solver, even when the problem setting is extended to include additional contexts such as viscosity and time-varying forces. This enables the same pretrained neural network to model vastly different conditions.
Alasdair Tran · Alexander Mathews · Lexing Xie · Cheng Soon Ong 🔗 


Equivariant graph neural networks as surrogate for computational fluid dynamics in 3D artery models
(Poster)
Computational fluid dynamics (CFD) is an invaluable tool in modern physics, but its time-intensity and computational complexity limit its applicability to practical problems, e.g. in medicine. Surrogate methods could speed up inference and allow use in such time-critical applications. We consider the problem of estimating hemodynamic quantities (i.e. related to blood flow) on the surface of 3D artery geometries and employ anisotropic graph convolution in an end-to-end SO(3)-equivariant neural network operating directly on the polygonal surface mesh. We show that our network can accurately predict hemodynamic vectors for each vertex on the surface mesh with a normalised mean absolute error of 0.6% and an approximation accuracy of 90.5%, demonstrating its feasibility as a surrogate method for CFD.
Julian Suk · Phillip Lippe · Christoph Brune · Jelmer Wolterink 🔗 


Probabilistic segmentation of overlapping galaxies for large cosmological surveys.
(Poster)
Encoder-decoder networks such as U-Nets have been applied successfully in a wide range of computer vision tasks, especially for image segmentation of different flavours across different fields. Nevertheless, most applications lack a satisfying quantification of the uncertainty of the prediction. Yet, a well-calibrated segmentation uncertainty can be a key element for scientific applications such as precision cosmology. In this ongoing work, we explore the use of the probabilistic version of the U-Net, recently proposed by Kohl et al. (2018), and adapt it to automate the segmentation of galaxies for large photometric surveys. We focus especially on the probabilistic segmentation of overlapping galaxies, also known as blending. We show that, even when training with a single ground truth per input sample, the model manages to properly capture a pixel-wise uncertainty on the segmentation map. Such uncertainty can then be propagated further down the analysis of the galaxy properties. To our knowledge, this is the first time such an experiment is applied for galaxy deblending in astrophysics.
Hubert Bretonniere · Marc Huertas-Company 🔗 


TurboSim: a generalised generative model with a physical latent space
(Poster)
We present TurboSim, a generalised autoencoder framework derived from principles of information theory that can be used as a generative model. By maximising the mutual information between the input and the output of both the encoder and the decoder, we are able to rediscover the loss terms usually found in adversarial autoencoders as well as various more sophisticated related models. Our generalised framework makes these models mathematically interpretable and allows for a diversity of new ones by setting the weight of each term separately. The framework is also independent of the intrinsic architecture of the encoder and the decoder thus leaving a wide choice for the building blocks of the whole network. We apply TurboSim to a collider physics generation problem: the transformation of the properties of several particles from a theory space, right after the collision, to an observation space, right after the detection in an experiment. 
Guillaume Quétant · Vitaliy Kinakh · Tobias Golling · Slava Voloshynovskiy 🔗 


Deterministic particle flows for constraining SDEs
(Poster)
Devising optimal interventions for diffusive systems often requires the solution of the Hamilton-Jacobi-Bellman (HJB) equation, a nonlinear backward partial differential equation (PDE), that is, in general, nontrivial to solve. Existing control methods either tackle the HJB directly with grid-based PDE solvers, or resort to iterative stochastic path sampling to obtain the necessary controls. Here, we present a framework that interpolates between these two approaches. By reformulating the optimal interventions in terms of logarithmic gradients (scores) of two forward probability flows, and by employing deterministic particle methods for solving Fokker-Planck equations, we introduce a novel deterministic particle framework that computes the required optimal interventions in one shot.
Dimitra Maoutsa · Manfred Opper 🔗 


Graph Segmentation in Scientific Datasets
(Poster)
Deep learning tools are being used extensively in a range of scientific domains; in particular, there has been a steady increase in the number of geometric deep learning solutions proposed for a variety of problems involving structured or relational scientific data. In this work, we report on the performance of graph segmentation methods for two scientific datasets from different fields. Based on our observations, we were able to discern the individual impact each type of graph segmentation method has on the dataset and how these methods can be used as precursors to deep learning pipelines.
Rajat Sahay · Savannah Thais 🔗 


CaloDVAE: Discrete Variational Autoencoders for Fast Calorimeter Shower Simulation
(Poster)
Calorimeter simulation is the most computationally expensive part of Monte Carlo generation of samples necessary for analysis of experimental data at the Large Hadron Collider (LHC). The High-Luminosity upgrade of the LHC would require an even larger amount of such samples. We present a technique based on Discrete Variational Autoencoders (DVAEs) to simulate particle showers in Electromagnetic Calorimeters. We discuss how this work paves the way towards exploration of quantum annealing processors as sampling devices for generation of simulated High Energy Physics datasets.
Abhishek Abhishek 🔗 


Efficient kernel methods for model-independent new physics searches
(Poster)
We present a novel kernel-based anomaly detection algorithm for model-independent new physics searches. The model is based on a re-weighted version of kernel logistic regression and aims at learning the likelihood-ratio test statistic from simulated anomaly-free background data and experimental data. Model-independence is enforced by avoiding any prior assumption about the presence or shape of new physics components in the data. This is made possible by kernel methods being nonparametric models that, given enough data, can approximate any continuous function and adapt to potentially any type of anomaly. This model shows dramatic advantages over similar neural network implementations in terms of training times and computational resources, while showing comparable performance. We test the model on datasets of different dimensionalities, showing that modern implementations of kernel methods are competitive options for large-scale problems.
Marco Letizia · Lorenzo Rosasco · Marco Rando 🔗 
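
As a rough illustration of two ingredients named in the abstract above, the sketch below shows a Gaussian-kernel expansion (the general form of a kernel model) and the standard conversion from a classifier probability to a log-likelihood ratio. It is not the authors' implementation: the Gaussian kernel, the assumption of balanced classes, and all function names are hypothetical choices made for this example only.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_model(x, centers, alphas, sigma=1.0):
    """Kernel expansion f(x) = sum_i alpha_i * k(x, center_i).

    In a fitted model the coefficients alpha_i would come from training;
    here they are simply supplied by the caller (assumption for illustration).
    """
    return sum(alpha * gaussian_kernel(x, center, sigma)
               for alpha, center in zip(alphas, centers))

def log_ratio_from_probability(p):
    """Log density ratio implied by a classifier probability p.

    Assumes the two classes (data vs. reference) are balanced, so that
    p / (1 - p) estimates the ratio of the two densities.
    """
    return math.log(p / (1.0 - p))
```

A probability of 0.5 maps to a log-ratio of zero, meaning the data and the reference sample are locally indistinguishable; values above 0.5 indicate a local excess of data over the reference.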


Analysis of ODE2VAE with Examples
(Poster)
Ordinary Differential Equation Variational AutoEncoder (ODE2VAE) is a deep latent variable model that aims to learn complex distributions over high-dimensional sequential data and their low-dimensional representations in a hierarchical latent space. The hierarchical organization of the latent space embeds a physics-guided inductive bias in the model. In this paper, we analyze the latent representations inferred by the ODE2VAE model over three different physical motion datasets: bouncing balls, projectile motion, and simple pendulum. We show that the model is able to learn meaningful latent representations to an extent without any supervision.
Batuhan Koyuncu 🔗 
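
To make the physics-guided structure concrete, the sketch below simulates a simple pendulum, one of the three motion types mentioned above, as a second-order system split into a position variable and a velocity variable. This is only a hand-written simulator under assumed parameters (unit length, g = 9.81, a semi-implicit Euler integrator); it is not the ODE2VAE model, and the function names are hypothetical.

```python
import math

def pendulum_step(theta, omega, dt, g=9.81, length=1.0):
    """Advance the pendulum by one semi-implicit Euler step.

    The second-order equation theta'' = -(g / length) * sin(theta) is written
    as two first-order updates: velocity first, then position.
    """
    omega_next = omega - (g / length) * math.sin(theta) * dt
    theta_next = theta + omega_next * dt
    return theta_next, omega_next

def simulate(theta0, omega0, dt=0.001, steps=2000):
    """Return the list of (angle, angular velocity) states along a trajectory."""
    trajectory = [(theta0, omega0)]
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = pendulum_step(theta, omega, dt)
        trajectory.append((theta, omega))
    return trajectory

def energy(theta, omega, g=9.81, length=1.0):
    """Total mechanical energy per unit mass (kinetic plus potential)."""
    return 0.5 * (length * omega) ** 2 + g * length * (1.0 - math.cos(theta))
```

Calling `simulate(0.5, 0.0)` yields a trajectory whose energy stays close to its initial value, which is a quick sanity check that the position and velocity variables are being updated consistently.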


Rethinking Graph Transformers with Spectral Attention
(Poster)
In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph. This LPE is then added to the node features of the graph and passed to a fully-connected Transformer. By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance. Further, by fully connecting the graph, the Transformer does not suffer from over-squashing, an information bottleneck of most GNNs, and enables better modeling of physical phenomena such as heat transfer and electric interaction. When tested empirically on a set of 4 standard datasets, our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin, becoming the first fully-connected architecture to perform well on graph benchmarks.
Devin Kreuzer · Will Hamilton · Vincent Létourneau 🔗 
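
One way to see why the Laplacian spectrum can encode node positions is the ring graph, where the Laplacian eigenvectors are ordinary sinusoids, much like the positional encodings used for sequences. The sketch below checks this numerically. It is a toy illustration only and does not implement SAN or its learned encoding; the helper names are hypothetical, and the ring is assumed to have at least three nodes.

```python
import math

def cycle_laplacian(n):
    """Graph Laplacian L = D - A of an unweighted ring with n >= 3 nodes."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lap[i][i] = 2.0                 # every node has two neighbours
        lap[i][(i + 1) % n] -= 1.0      # edge to the next node
        lap[i][(i - 1) % n] -= 1.0      # edge to the previous node
    return lap

def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def fourier_mode(n, k):
    """Cosine wave with k full periods around the ring."""
    return [math.cos(2.0 * math.pi * k * i / n) for i in range(n)]

def ring_eigenvalue(n, k):
    """Laplacian eigenvalue associated with the k-th cosine mode."""
    return 2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)
```

Applying `matvec(cycle_laplacian(n), fourier_mode(n, k))` returns the same cosine wave scaled by `ring_eigenvalue(n, k)`, so low eigenvalues correspond to slowly varying position signals and high eigenvalues to rapidly varying ones.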


Crystal Diffusion Variational Autoencoder for Periodic Material Generation
(Poster)
Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) different atom types have complex, yet specific bonding preferences. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We generate significantly more realistic materials than past methods in two tasks: 1) reconstructing the input structure, and 2) generating valid, diverse, and realistic materials. Our contribution also includes the creation of several standard datasets and evaluation metrics for the broader machine learning community.
Tian Xie · Xiang Fu · Octavian Ganea · Regina Barzilay · Tommi Jaakkola 🔗 
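
The idea of interactions across periodic boundaries can be illustrated with the minimum-image convention, a standard way to measure the distance between two atoms in a repeating cell. The sketch below is not taken from CDVAE: it assumes a cubic cell with a single edge length and atoms given in fractional coordinates, and the function name is hypothetical.

```python
import math

def minimum_image_distance(frac_a, frac_b, cell_length):
    """Shortest distance between two atoms in a periodic cubic cell.

    frac_a and frac_b are fractional coordinates; cell_length is the edge
    length of the (assumed cubic) cell in the desired distance unit.
    """
    dist_sq = 0.0
    for a, b in zip(frac_a, frac_b):
        delta = a - b
        delta -= round(delta)   # wrap the separation into [-0.5, 0.5]
        dist_sq += (delta * cell_length) ** 2
    return math.sqrt(dist_sq)
```

Two atoms at fractional positions 0.05 and 0.95 along one axis are therefore treated as close neighbours across the cell face rather than as nearly a full cell apart, and shifting both atoms by the same amount leaves the distance unchanged.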


Self-supervised similarity search for large scientific datasets
(Poster)
We present the use of self-supervised learning to explore and exploit large unlabeled datasets. Focusing on 42 million galaxy images from the latest data release of the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys, we first train a self-supervised model to distil low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image. We then use the representations to construct and publicly release an interactive semantic similarity search tool. We demonstrate how our tool can be used to rapidly discover rare objects given only a single example, increase the speed of crowd-sourcing campaigns, flag bad data, and construct and improve training sets for supervised applications. While we focus on images from sky surveys, the technique is straightforward to apply to any scientific dataset of any dimensionality. The similarity search web app can be found at 
George Stein 🔗 
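
A similarity search over learned representations can be sketched as a nearest-neighbour lookup: given one example, rank all other items by how close their representation vectors are. The snippet below is a minimal brute-force version. The use of cosine similarity, the two-dimensional toy vectors, and the names are assumptions for illustration; the abstract does not specify the metric or the index used by the released tool.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, k=3):
    """Return the names of the k items whose vectors are closest to the query.

    embeddings maps an item name to its representation vector.
    """
    scored = [(cosine_similarity(query, vector), name)
              for name, vector in embeddings.items()]
    scored.sort(reverse=True)   # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical toy representations; a real model would produce longer vectors.
toy_embeddings = {
    "galaxy_a": [1.0, 0.0],
    "galaxy_b": [0.0, 1.0],
    "galaxy_c": [0.9, 0.2],
}
```

With millions of items, a brute-force scan like this would typically be replaced by an approximate nearest-neighbour index, but the ranking principle stays the same.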