Workshop
Machine Learning and the Physical Sciences
Anima Anandkumar · Kyle Cranmer · Mr. Prabhat · Lenka Zdeborová · Atilim Gunes Baydin · Juan Carrasquilla · Emine Kucukbenli · Gilles Louppe · Benjamin Nachman · Brian Nord · Savannah Thais

Mon Dec 13 06:00 AM -- 03:30 PM (PST)
Event URL: https://ml4physicalsciences.github.io

The "Machine Learning and the Physical Sciences" workshop aims to provide a cutting-edge venue for research at the interface of machine learning (ML) and the physical sciences. This interface spans (1) applications of ML in physical sciences (“ML for physics”) and (2) developments in ML motivated by physical insights (“physics for ML”).

ML methods have had great success in learning complex representations of data that enable novel modeling and data processing approaches in many scientific disciplines. The physical sciences span problems and challenges at all scales in the universe: from finding exoplanets in trillions of sky pixels, to finding ML-inspired solutions to the quantum many-body problem, to detecting anomalies in event streams from the Large Hadron Collider, to predicting how extreme weather events will vary with climate change. Tackling a number of associated data-intensive tasks, including but not limited to segmentation, 3D computer vision, sequence modeling, causal reasoning, generative modeling, and efficient probabilistic inference, is critical for furthering scientific discovery. In addition to using ML models for scientific discovery, tools and insights from the physical sciences are increasingly being brought to the study of ML models.

By bringing together ML researchers and physical scientists who apply and study ML, we expect to strengthen the interdisciplinary dialogue, introduce exciting new open problems to the broader community, and stimulate the production of new approaches to solving challenging open problems in the sciences. Invited talks from leading individuals in both communities will cover the state-of-the-art techniques and set the stage for this workshop, which will also include contributed talks selected from submissions. The workshop will additionally feature an expert panel discussion on “Physics for ML” and a breakout session dedicated to community building, which will serve to foster dialogue between the physical science and ML research communities.

Mon 6:00 a.m. - 6:10 a.m.
Session 1 | Opening remarks (Live intro)
Mon 6:10 a.m. - 6:35 a.m.
Session 1 | Invited talk: Max Welling, "Accelerating simulations of nature, both classical and quantum, with equivariant deep learning" (Invited talk (live))
Max Welling · Atilim Gunes Baydin
Mon 6:35 a.m. - 6:45 a.m.
Session 1 | Invited talk Q&A: Max Welling (Live Q&A)
Mon 6:45 a.m. - 7:10 a.m.
Session 1 | Invited talk: Bingqing Cheng, "Predicting material properties with the help of machine learning" (Invited talk (live))
Bingqing Cheng · Atilim Gunes Baydin
Mon 7:10 a.m. - 7:20 a.m.
Session 1 | Invited talk Q&A: Bingqing Cheng (Live Q&A)
Mon 7:20 a.m. - 7:35 a.m.
(Contributed talk (live))

Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) different atom types have complex, yet specific bonding preferences. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We generate significantly more realistic materials than past methods in two tasks: 1) reconstructing the input structure, and 2) generating valid, diverse, and realistic materials. Our contribution also includes the creation of several standard datasets and evaluation metrics for the broader machine learning community.

Tian Xie · Atilim Gunes Baydin
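The decoder's denoising dynamic can be illustrated with a toy, annealed-Langevin-style update. This is a sketch, not the authors' CDVAE: the learned score network is replaced by the gradient of a hypothetical pairwise energy that simply prefers unit interatomic spacing, and atom-type updates and periodic boundaries are not modeled.

```python
import numpy as np

def toy_energy_grad(x):
    """Gradient of a toy pairwise energy E = sum_{i<j} (|x_i - x_j| - 1)^2,
    standing in for the learned score of the actual decoder."""
    diffs = x[:, None, :] - x[None, :, :]                  # (N, N, D)
    dists = np.linalg.norm(diffs, axis=-1) + np.eye(len(x))
    w = 2.0 * (dists - 1.0) / dists                        # dE/dd, divided by d
    np.fill_diagonal(w, 0.0)
    return (w[:, :, None] * diffs).sum(axis=1)

def langevin_denoise(x, n_steps=500, step=1e-2, noise=1e-3, seed=0):
    """Follow the negative energy gradient with a little injected noise,
    moving coordinates toward a lower-energy configuration."""
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        x = x - step * toy_energy_grad(x) + noise * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(1)
x0 = rng.standard_normal((4, 2))   # four "atoms" in 2D, random start
x1 = langevin_denoise(x0)          # relaxed toward the preferred unit spacing
```

After the loop, pairwise distances sit much closer to the preferred spacing than in the random start, which is the essence of the diffusion process described above.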
Mon 7:35 a.m. - 9:05 a.m.
Session 1 | Poster session (Poster session (Gather.town))
Mon 9:05 a.m. - 9:10 a.m.
Session 2 | Opening remarks (Live intro)
Mon 9:10 a.m. - 10:10 a.m.
Session 2 | Panel discussion: Jennifer Chayes, Marylou Gabrié, Michela Paganini, Sara Solla, Moderator: Lenka Zdeborová (Live panel discussion)
Mon 10:10 a.m. - 10:35 a.m.
Session 2 | Invited talk: Megan Ansdell, "NASA's efforts & opportunities to support ML in the Physical Sciences" (Invited talk (live))
Megan Ansdell · Atilim Gunes Baydin
Mon 10:35 a.m. - 10:45 a.m.
Session 2 | Invited talk Q&A: Megan Ansdell (Live Q&A)
Mon 10:45 a.m. - 11:00 a.m.
(Contributed talk (live))

We present the use of self-supervised learning to explore and exploit large unlabeled datasets. Focusing on 42 million galaxy images from the latest data release of the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys, we first train a self-supervised model to distil low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image. We then use the representations to construct and publicly release an interactive semantic similarity search tool. We demonstrate how our tool can be used to rapidly discover rare objects given only a single example, increase the speed of crowd-sourcing campaigns, flag bad data, and construct and improve training sets for supervised applications. While we focus on images from sky surveys, the technique is straightforward to apply to any scientific dataset of any dimensionality. The similarity search web app can be found at .

George Stein · Atilim Gunes Baydin
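The released tool is a web app, but the core of any such semantic similarity search is a nearest-neighbour query over the learned representations. A minimal sketch, assuming only that per-image embedding vectors are available (the random vectors below are stand-ins, with one planted near-duplicate):

```python
import numpy as np

def build_index(embeddings):
    """L2-normalise so that a dot product equals cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / norms

def most_similar(index, query_id, k=5):
    """Return the k most cosine-similar items to one example (excluding itself)."""
    sims = index @ index[query_id]
    order = np.argsort(-sims)
    return [int(i) for i in order if i != query_id][:k]

rng = np.random.default_rng(0)
reps = rng.standard_normal((1000, 64))               # stand-in representations
reps[42] = reps[7] + 0.01 * rng.standard_normal(64)  # plant a near-duplicate
idx = build_index(reps)
print(most_similar(idx, 7, k=1))                     # → [42]
```

Given a single rare object as the query, the same call surfaces its closest matches in the full dataset, which is how a one-example discovery workflow operates.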
Mon 11:00 a.m. - 12:30 p.m.
Session 2 | Poster session (Poster session (Gather.town))
Mon 12:30 p.m. - 12:35 p.m.
Session 3 | Opening remarks (Live intro)
Mon 12:35 p.m. - 1:00 p.m.
Session 3 | Invited talk: Surya Ganguli, "From the geometry of high dimensional energy landscapes to optimal annealing in a dissipative many body quantum optimizer" (Invited talk (live))
Surya Ganguli · Atilim Gunes Baydin
Mon 1:00 p.m. - 1:10 p.m.
Session 3 | Invited talk Q&A: Surya Ganguli (Live Q&A)
Mon 1:10 p.m. - 1:35 p.m.
Session 3 | Invited talk: Laure Zanna, "The future of climate modeling in the age of machine learning" (Invited talk (live))
Laure Zanna · Atilim Gunes Baydin
Mon 1:35 p.m. - 1:45 p.m.
Session 3 | Invited talk Q&A: Laure Zanna (Live Q&A)
Mon 1:45 p.m. - 2:00 p.m.
(Contributed talk (live))

Gravitational waves (GWs) detected by the LIGO and Virgo observatories encode descriptions of their astrophysical progenitors. To characterize these systems, physical GW signal models are inverted using Bayesian inference coupled with stochastic samplers---a task that can take O(day) for a typical binary black hole. Several recent efforts have attempted to speed this up by using normalizing flows to estimate the posterior distribution conditioned on the observed data. In this study, we further develop these techniques to achieve results nearly indistinguishable from standard samplers when evaluated on real GW data, with inference times of one minute per event. This is enabled by (i) incorporating detector nonstationarity from event to event by conditioning on a summary of the noise characteristics, (ii) using an embedding network adapted to GW signals to compress data, and (iii) adopting a new inference algorithm that makes use of underlying physical equivariances.

Maximilian Dax · Atilim Gunes Baydin
Mon 2:00 p.m. - 3:00 p.m.
Session 3 | Community development breakouts (Community breakout session (Gather.town))
Mon 3:00 p.m. - 3:30 p.m.
Session 3 | Feedback from community development breakouts (Live feedback)
-
(Poster) [ Visit Poster at Spot J3 in Virtual World ]   

Floods wreak havoc throughout the world, causing billions of dollars in damages and uprooting communities, ecosystems, and economies. The NASA Impact Emerging Techniques in Computational Intelligence (ETCI) competition on Flood Detection tasked participants with predicting flooded pixels after training with synthetic aperture radar (SAR) images in a supervised setting. We propose a semi-supervised pseudo-labeling scheme that derives confidence estimates from U-Net ensembles, thereby progressively improving accuracy. Concretely, we use a cyclical approach involving multiple stages: (1) training an ensemble of multiple U-Net architectures on the provided high-confidence hand-labeled data together with pseudo labels (low-confidence labels) generated on the entire unlabeled test dataset; (2) filtering out low-quality generated labels; and (3) combining the retained generated labels with the previously available high-confidence hand-labeled dataset. This assimilated dataset is used for the next round of ensemble training, and the cycle is repeated until the performance improvement plateaus. Additionally, we post-process our results with Conditional Random Fields. Our approach sets a new state of the art on the Sentinel-1 dataset for the ETCI competition with 0.7654 IoU, a substantial improvement over the 0.60 IoU baseline. Our method, which we release with all the code including trained models, can also be used as an open science benchmark for the released Sentinel-1 dataset.

Siddha Ganju · Sayak Paul
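The confidence-filtering step at the heart of the cycle can be sketched as follows, assuming per-pixel flood probabilities from an ensemble are available. The 0.9 threshold and the confidence measure (distance of the ensemble-mean probability from the decision boundary) are illustrative choices, not the authors' exact scheme:

```python
import numpy as np

def ensemble_pseudo_labels(prob_maps, keep_thresh=0.9):
    """
    prob_maps: (n_models, n_pixels) flood probabilities from a U-Net ensemble.
    Returns hard pseudo labels plus a mask of pixels confident enough to keep.
    """
    mean_prob = prob_maps.mean(axis=0)          # ensemble average
    pseudo = (mean_prob > 0.5).astype(int)      # hard pseudo label per pixel
    # confidence: how far the averaged probability sits from the 0.5 boundary
    confidence = np.abs(mean_prob - 0.5) * 2.0
    keep = confidence >= keep_thresh
    return pseudo, keep

probs = np.array([[0.97, 0.55, 0.02],
                  [0.99, 0.40, 0.05],
                  [0.95, 0.60, 0.01]])
labels, keep = ensemble_pseudo_labels(probs)
print(labels.tolist(), keep.tolist())   # → [1, 1, 0] [True, False, True]
```

Only the kept pixels would be merged back into the training set for the next round; the ambiguous middle pixel is discarded.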
-
(Poster) [ Visit Poster at Spot J2 in Virtual World ]   

Characterization of the environment in which communication is taking place, termed the \emph{radio channel}, is imperative for the design and analysis of communication systems. Stochastic models of the radio channel are widely used simulation tools that construct a probabilistic model of the radio channel. Calibrating these models to new measurement data is challenging when the likelihood function is intractable. The standard approach to this problem involves sophisticated algorithms for the extraction and clustering of multipath components, after which point estimates of the model parameters can be obtained using specialized estimators. We instead propose an approximate Bayesian computation algorithm based on the maximum mean discrepancy with a kernel carefully crafted for this task. The proposed method estimates the parameters of the model accurately in simulations, and has the advantage that it can be used on a wide range of models.

Ayush Bharti · Francois-Xavier Briol
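The essential loop (simulate from prior draws, score each simulation against the observed data with an MMD, keep the closest draws) can be sketched on a toy problem. The Gaussian kernel and the simple rejection scheme below are illustrative; the paper's kernel is crafted specifically for channel data:

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared maximum mean discrepancy, Gaussian kernel."""
    def k(a, b):
        return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * bandwidth**2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def abc_mmd(observed, simulate, prior_draws, quantile=0.1, seed=0):
    """ABC rejection: keep the prior draws whose simulated datasets sit
    closest to the observed data under the MMD distance."""
    rng = np.random.default_rng(seed)
    dists = np.array([mmd2(observed, simulate(th, rng)) for th in prior_draws])
    return prior_draws[dists <= np.quantile(dists, quantile)]

# toy "channel model": observations are Gaussian with unknown mean theta
simulate = lambda th, rng: th + rng.standard_normal(100)
observed = 3.0 + np.random.default_rng(1).standard_normal(100)
prior = np.random.default_rng(2).uniform(-10.0, 10.0, size=200)
posterior = abc_mmd(observed, simulate, prior)
# the accepted draws concentrate around the true mean of 3
```

No likelihood is ever evaluated: only the ability to simulate from the model is required, which is exactly what makes the approach attractive for intractable channel models.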
-
(Poster) [ Visit Poster at Spot J1 in Virtual World ]   

Seismic surveys are a valuable source of information for mineral exploration activities. We introduce a reflection seismic survey dataset acquired at four distinct hard rock mining sites to stimulate the development of new seismic data interpretation approaches. In particular, we provide annotations as well as a sound benchmarking methodology to evaluate the transferability of supervised first break picking solutions on our dataset. We train and evaluate a baseline solution based on a U-Net and discuss potential improvements to this approach.

Pierre-Luc St-Charles · Joumana Ghosn
-
(Poster) [ Visit Poster at Spot J0 in Virtual World ]

Wide-field astronomical surveys are often affected by the presence of undesirable reflections (often known as “ghosting artifacts” or “ghosts”) and scattered-light artifacts. The identification and mitigation of these artifacts is important for rigorous astronomical analyses of faint and low-surface-brightness systems. In this work, we use images from the Dark Energy Survey (DES) to train, validate, and test a deep neural network (Mask R-CNN) to detect and localize ghosts and scattered-light artifacts. We find that the ability of the Mask R-CNN model to identify affected regions is superior to that of conventional algorithms that model the physical processes that lead to such artifacts, thus providing a powerful technique for the automated detection of ghosting and scattered-light artifacts in current and near-future surveys.

Dimitrios Tanoglidis · Aleksandra Ciprijanovic
-
(Poster) [ Visit Poster at Spot I3 in Virtual World ]

Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.

Katiana Kontolati · Nishant Panda · Diane Oyen
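The change-of-variables rule that normalizing flows rely on can be shown with the simplest possible flow, a one-dimensional affine map; real spectral models stack many learned invertible layers, but the log-density bookkeeping is the same:

```python
import numpy as np

def affine_flow_logpdf(x, scale, shift):
    """log-density of x = scale * z + shift with base z ~ N(0, 1):
    log p(x) = log N(z(x); 0, 1) - log |dx/dz|, where dx/dz = scale."""
    z = (x - shift) / scale
    return -0.5 * (z**2 + np.log(2 * np.pi)) - np.log(np.abs(scale))

xs = np.linspace(-30.0, 30.0, 200001)
p = np.exp(affine_flow_logpdf(xs, 2.0, 0.5))
# the transformed density still integrates to one (trapezoid rule)
mass = np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(xs))
print(round(mass, 6))   # → 1.0
```

Stacking invertible layers just accumulates the log-Jacobian terms, which is what makes exact density evaluation (and hence the downstream inference tasks above) possible.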
-
(Poster) [ Visit Poster at Spot I2 in Virtual World ]

Deep learning models are being increasingly adopted in a wide array of scientific domains, especially to handle the high dimensionality and volume of scientific data. However, these models tend to be brittle due to their complexity and overparametrization, especially to adversarial perturbations that can arise from common image processing operations, such as compression or blurring, that are often seen with real scientific data. It is crucial to understand this brittleness and develop models robust to these adversarial perturbations. To this end, we study the effect of observational noise from the exposure time, as well as the worst-case scenario of a one-pixel attack as a proxy for compression or telescope errors, on the performance of a ResNet18 trained to distinguish between galaxies of different morphologies in LSST mock data. We also explore how domain adaptation techniques can help improve model robustness in the case of this type of naturally occurring attack and help scientists build more trustworthy and stable models.

Aleksandra Ciprijanovic · Diana Kafkes · Gabriel Nathan Perdue · Sandeep Madireddy · Stefan Wild · Brian Nord
-
(Poster) [ Visit Poster at Spot I1 in Virtual World ]   

Ptychography, as an essential tool for high-resolution and nondestructive material characterization, presents a challenging large-scale nonlinear and non-convex inverse problem; however, its intrinsic photon statistics create clear opportunities for statistics-based deep learning approaches to tackle these challenges, an avenue that has so far been underexplored. In this work, we explore normalizing flows to obtain a surrogate for the high-dimensional posterior, which also enables characterization of the uncertainty associated with the reconstruction: an extremely desirable capability when judging the reconstruction quality in the absence of ground truth, spotting spurious artifacts, and guiding future experiments using the returned uncertainty patterns. We demonstrate the performance of the proposed method on a synthetic sample with added noise and in various physical experimental settings.

Agnimitra Dasgupta
-
(Poster) [ Visit Poster at Spot I0 in Virtual World ]   

We explore model inversion using the Fourier Neural Operator (FNO) of Li et al. The approach learns an FNO emulator of the partial differential equation forward operator from simulated realisations; the latent inputs (physical system parameters) are then selected by solving an optimisation problem to match a set of observations. Our results suggest that this underdetermined inverse problem is substantially harder, but that careful regularisation improves our inference considerably.

Daniel MacKinlay · Daniel Pagendam · Petra Kuhnert
-
(Poster) [ Visit Poster at Spot H3 in Virtual World ]

A mixture-of-experts ensemble of hierarchical deep metric learning models is introduced to identify materials from X-ray diffraction spectra. In previous studies, the identification accuracy of 1D convolutional neural network models deteriorates significantly as the number of classes increases. To overcome this problem, a hierarchical deep metric learning model was developed that can identify approximately 10,000 classes with an average top-1 accuracy of 87%. Furthermore, this new model was employed to create expert models for 73 general chemical elements, which in turn were used to construct a mixture-of-experts ensemble. This ensemble model successfully identified materials from 136,899 classes with a top-1 accuracy of 98%.

Masaki Adachi
-
(Poster) [ Visit Poster at Spot H2 in Virtual World ]   

Markov Chain Monte Carlo (MCMC) methods are widely used for Bayesian inference in astronomy. However, when applied to datasets coming from next-generation telescopes, inference becomes computationally expensive. We propose using amortized variational inference to estimate the posterior of a supernova light curve parametric model. We show that amortization with a recurrent neural network is significantly faster than MCMC while providing competitive estimates of the predictive distribution. To the best of our knowledge, this is the first time this fast amortized framework is applied to astronomical light curves. This approach will be essential when estimating the posterior of astrophysical parameters for terabytes of data per night that next-generation telescopes will produce.

Alexis Sánchez · Pablo And Huijse · Francisco Förster · Guillermo Cabrera-Vives
-
(Poster) [ Visit Poster at Spot H1 in Virtual World ]

While Bayesian Optimization (BO) has emerged as a sample-efficient optimization method for accelerating drug discovery, it has rarely been applied to the process optimization of pharmaceutical manufacturing, which has traditionally relied on human intuition, along with trial and error and slow cycles of learning. The combinatorial and hierarchical complexity of such process control also introduces challenges related to high-dimensional design spaces and the need for larger-scale observations, settings in which BO has typically scaled poorly. In this paper, we use penicillin production as a case study to demonstrate the efficacy of BO in accelerating the optimization of typical pharmaceutical manufacturing processes. To overcome the challenges raised by high dimensionality, we apply a trust region BO approach (TuRBO) for global optimization of penicillin yield and empirically show that it outperforms other BO and random baselines. We also extend the study by leveraging BO in the context of multi-objective optimization, allowing us to further evaluate the trade-offs between penicillin yield, production time, and CO$_2$ emission as a by-product. By quantifying the performance of BO across high-dimensional and multi-objective optimization on drug production processes, we hope to popularize the application of BO in this field and encourage closer collaboration between the machine learning and broader scientific communities.

Qiaohao Liang
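TuRBO's key mechanic (maintain a trust region around the incumbent, expand it on success and shrink it on failure) can be sketched without the surrogate model. This is not the authors' setup: the Gaussian-process model and acquisition function are replaced by uniform candidate sampling, and the "yield" below is a toy quadratic stand-in:

```python
import numpy as np

def turbo_like_maximize(f, dim, n_iters=100, n_cand=64, seed=0):
    """Trust-region search sketch: propose candidates inside a box around the
    incumbent, expand the box on success, shrink it on failure."""
    rng = np.random.default_rng(seed)
    x_best = rng.uniform(0.0, 1.0, dim)
    y_best = f(x_best)
    length = 0.4                                  # trust-region half-width
    for _ in range(n_iters):
        cand = np.clip(x_best + length * rng.uniform(-1, 1, (n_cand, dim)), 0, 1)
        ys = np.array([f(c) for c in cand])
        if ys.max() > y_best:                     # success: move and expand
            x_best, y_best = cand[ys.argmax()], ys.max()
            length = min(0.8, length * 1.5)
        else:                                     # failure: shrink
            length = max(0.01, length / 2.0)
    return x_best, y_best

# toy stand-in for penicillin yield, peaked at x = 0.3 in every dimension
yield_fn = lambda x: -np.sum((x - 0.3) ** 2)
x_opt, y_opt = turbo_like_maximize(yield_fn, dim=5)
# y_opt approaches the optimum value of 0
```

Restricting proposals to an adaptively sized box is what lets trust-region methods stay effective in higher-dimensional design spaces where global surrogates degrade.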
-
(Poster) [ Visit Poster at Spot H0 in Virtual World ]

The gravitational $N$-body problem, which is fundamentally important in astrophysics for predicting the motion of $N$ celestial bodies under their mutual gravity, is usually solved numerically because there is no known general analytical solution for $N>2$. Can an $N$-body problem be solved accurately by a neural network (NN)? Can a NN observe long-term conservation of energy and orbital angular momentum? Inspired by the Wisdom \& Holman (1991) symplectic map, we present a neural $N$-body integrator that splits the Hamiltonian into a two-body part, solvable analytically, and an interaction part that we approximate with a NN. Our neural symplectic $N$-body code integrates a general three-body system at $\mathcal{O}(N)$ complexity for $10^{5}$ steps without deviating from the ground-truth dynamics obtained from a traditional $N$-body integrator. Moreover, it exhibits good inductive bias by successfully predicting the dynamical evolution of $N$-body systems that are not part of the training set.

Maxwell Cai · Simon Portegies Zwart · Damian Podareanu
-
(Poster) [ Visit Poster at Spot G3 in Virtual World ]

High-performance computing (HPC) applications are frequently communication-bound and so are unable to take advantage of the full extent of compute resources available on a node. Examples abound in scientific computing, where large-scale partial differential equations (PDEs) are solved on hundreds to thousands of nodes. The vast majority of these problems rely on mesh-based discretization techniques and on the calculation of fluxes across element boundaries. The mathematical expression for those fluxes is based on data that is available in the local memory and on neighboring data transferred from another compute node. That data transfer can account for a significant percentage of the simulation time and energy consumption. We present algorithmic approaches for replacing data transfers with local computations, potentially leading to a reduction in simulation cost and avenues for kernel acceleration that would otherwise not be worthwhile. The communication cost can be reduced by up to 50%, with limited impact on physical simulation accuracy.

Laurent White · Ganesh Dasika
-
(Poster) [ Visit Poster at Spot G2 in Virtual World ]   

The transport of traffic flow can be modeled by the advection equation. Finite difference and finite volume methods have been used to numerically solve this hyperbolic equation on a mesh. Advection has also been modeled discretely on directed graphs using the graph advection operator [4, 18]. In this paper, we first show that we can reformulate this graph advection operator as a finite difference scheme. We then propose the Directed Graph Advection Matérn Gaussian Process (DGAMGP) model that incorporates the dynamics of this graph advection operator into the kernel of a trainable Matérn Gaussian Process to effectively model traffic flow and its uncertainty as an advective process on a directed graph.

Nadim Saad · Danielle Maddix · Bernie Wang
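The finite-difference view of graph advection can be checked directly: with out-degree matrix D_out and adjacency matrix A, a forward-difference step of du/dt = -(D_out - A^T) u moves mass along directed edges while conserving it exactly. A small sketch on a three-node cycle (the edge weights and step size are illustrative; the operator definition follows the standard graph advection formulation, not necessarily the paper's exact notation):

```python
import numpy as np

def graph_advection_operator(A):
    """L = D_out - A^T, so that u' = u - dt * (L @ u) is a forward-difference
    step of du/dt = -L u: mass flows along edges and is conserved exactly."""
    return np.diag(A.sum(axis=1)) - A.T

# three-node directed cycle with unit edge weights: 0 -> 1 -> 2 -> 0
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
L = graph_advection_operator(A)

u = np.array([1.0, 0.0, 0.0])      # all mass starts on node 0
dt = 0.1
for _ in range(100):
    u = u - dt * (L @ u)           # explicit finite-difference update

print(round(u.sum(), 9))           # → 1.0  (total mass conserved)
```

On this cycle the state relaxes toward the uniform distribution while the column sums of L guarantee conservation at every step, which is the property the kernel construction builds on.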
-
(Poster) [ Visit Poster at Spot G1 in Virtual World ]   

The absorption of light by molecules in the atmosphere of Earth is a complication for ground-based observations of astrophysical objects. Comprehensive information on various molecular species is required to correct for this so-called telluric absorption. We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph. We accomplish this by reducing the data into a compressed representation, which allows us to unveil the underlying solar spectrum and simultaneously uncover the different modes of variation in the observed spectra relating to the absorption of H2O and O2 in the atmosphere of Earth. We demonstrate how the extracted components can be used to remove H2O and O2 tellurics in a validation observation with similar accuracy to, and at less computational expense than, a synthetic approach with molecfit.

Rune Kjærsgaard · Line Clemmensen
-
(Poster) [ Visit Poster at Spot G0 in Virtual World ]   

We present a super-resolution model for an advection-diffusion process with limited information. While most super-resolution models assume high-resolution (HR) ground-truth data during training, in many cases such an HR dataset is not readily accessible. Here, we show that a Recurrent Convolutional Network trained with physics-based regularizations is able to reconstruct the HR information without having the HR ground-truth data. Moreover, considering the ill-posed nature of a super-resolution problem, we employ a Recurrent Wasserstein Autoencoder to model the uncertainty.

Chulin Wang · Kyongmin Yeo · Andres Codas · Xiao Jin · Bruce Elmegreen · kleinl
-
(Poster) [ Visit Poster at Spot F3 in Virtual World ]   

Raw light curve data from exoplanet transits is too complex for the naive application of traditional outlier detection methods. We propose an architecture which estimates a latent representation of both the main transit and residual deviations with a pair of variational autoencoders. We show, using two fabricated datasets, that our latent representations of anomalous transit residuals are significantly more amenable to outlier detection than raw data or the latent representation of a traditional variational autoencoder. We then apply our method to real exoplanet transit data. Our study is the first to automatically identify anomalous exoplanet transit light curves. We additionally release three first-of-their-kind datasets to enable further research.

Christoph Hönes · Benjamin K Miller
-
(Poster) [ Visit Poster at Spot F2 in Virtual World ]   

The identification of gravitationally lensed supernovae in modern astronomical datasets is a needle-in-a-haystack problem with dramatic scientific implications: discovered systems can be used to directly measure, and help resolve, the current tension in the value of the expansion rate of the Universe. We hypothesize that the image-based features of the gravitational lensing and the temporal features of the time-varying brightness are equally important for classification. We therefore develop a deep learning technique that simultaneously utilizes long short-term memory cells for the time-varying brightness of astronomical systems and convolutional layers for the raw images of those systems, and then concatenates the feature maps with multiple fully connected layers. This approach achieves a receiver operating characteristic area under curve of 0.97 on simulated astronomical data and, more importantly, outperforms standalone versions of its recurrent and convolutional constituents. We find that combining recurrent and convolutional layers within one coherent network architecture allows the network to optimally weight and aggregate the temporal and image features, yielding a promising tool for lensed supernova identification.

Robert Morgan · Brian Nord
-
(Poster) [ Visit Poster at Spot F1 in Virtual World ]

Neural-network-based surrogate models, which replace (parts of) a physics-based simulator, are attractive for their efficiency, yet they suffer from a lack of extrapolation capability. Focusing on the wave equation, we investigate the use of several physics-based regularization terms in the loss function as a way to increase the extrapolation accuracy, together with assessing the impact of a term that conditions the neural network to weakly satisfy the boundary conditions. These regularization terms do not require any labeled data. By gradually incorporating the regularization terms while training, we achieve a more than 5X reduction in extrapolation error compared to a baseline (i.e., physics-less) neural network that is trained with the same set of labeled data. We map out future research directions, and provide some insights about leveraging the trained neural-network state for devising sampling strategies.

Ganesh Dasika · Laurent White
-
(Poster) [ Visit Poster at Spot F0 in Virtual World ]

Recently, there has been a renewed interest in returning to the Moon, with many planned missions targeting the south pole. This region is of high scientific and commercial interest, mostly due to the presence of water-ice and other volatiles which could enable our sustainable presence on the Moon and beyond. In order to plan safe and effective crewed and robotic missions, access to high-resolution (<0.5 m) surface imagery is critical. However, the overwhelming majority (99.7%) of existing images over the south pole have spatial resolutions >1 m. Currently, the only way to obtain better images is to launch a new satellite mission to the Moon with better equipment to gather more precise data. In this work we develop an alternative that can be applied directly to previously gathered data, thereby saving substantial resources. It consists of a single-image super-resolution (SR) approach based on generative adversarial networks that is able to super-resolve existing images from 1 m to 0.5 m resolution, unlocking a large catalogue of images (∼50,000) for more accurate mission planning in the region of interest for the upcoming missions. We show that our enhanced images reveal previously unseen hazards such as small craters and boulders, allowing safer traverse planning. Our approach also includes uncertainty estimation, which allows mission planners to understand the reliability of the super-resolved images.

Jose Delgado-Centeno · Paula Harder · Ben Moseley · Valentin Bickel · Siddha Ganju · Miguel Olivares · Freddie Kalaitzis
-
(Poster) [ Visit Poster at Spot E3 in Virtual World ]   

The ARIANNA experiment is a detector designed to record radio signals created by high-energy neutrino interactions in the Antarctic ice. Because of the low neutrino rate at high energies, the physics output is limited by statistics. Hence, an increase in detector sensitivity significantly improves the interpretation of data and offers the ability to probe new physics. The trigger thresholds of the detector are limited by the rate of triggering on unavoidable noise. A real-time noise rejection algorithm enables the thresholds to be lowered substantially and increases the sensitivity of the detector by up to a factor of two compared to the current ARIANNA capabilities. Deep learning discriminators based on Fully Connected Neural Networks (FCNN) and Convolutional Neural Networks (CNN) are evaluated for their ability to reject a high percentage of noise events (while retaining most of the neutrino signal) and to classify events quickly. In particular, we describe a CNN trained on Monte Carlo data that runs on the current ARIANNA microcontroller and retains 95% of the neutrino signal at a noise rejection factor of 10^5.

Astrid Anker
-
(Poster) [ Visit Poster at Spot E2 in Virtual World ]   

High-energy neutrinos (above a few \SI{e16}{eV}) can be detected cost-effectively with a sparse array of radio detector stations installed in polar ice sheets. The technology has been explored successfully in pilot arrays. A large radio detector is currently being constructed in Greenland with the potential to measure the first cosmogenic neutrino, and an order-of-magnitude more sensitive detector is being planned with IceCube-Gen2. We present the first end-to-end reconstruction of the neutrino energy from radio detector data. NuRadioMC was used to create a large dataset of 40 million events of expected radio signals that are generated via the Askaryan effect following a neutrino interaction in the ice, for a broad range of neutrino energies between \SI{100}{PeV} and \SI{10}{EeV}. We simulated the voltage traces that would be measured by the five antennas of a shallow detector station in the presence of noise. We designed and trained a deep neural network to determine the shower energy directly from the simulated experimental data and achieve a resolution better than a factor of two (STD < 0.3 in log10(E)), which is below the irreducible uncertainty from inelasticity fluctuations. We present the model architecture and study the dependence of the resolution on event parameters. This method will enable Askaryan detectors to measure the neutrino energy.

Stephen McAleer · Christian Glaser · Pierre Baldi
-
(Poster) [ Visit Poster at Spot E1 in Virtual World ]

Astrometric lensing has recently emerged as a promising avenue for characterizing the population of dark matter clumps---subhalos---in our Galaxy. Leveraging recent advances in simulation-based inference and neural network architectures, we introduce a novel method to look for global dark matter-induced lensing signatures in astrometric datasets. Our method shows significantly greater sensitivity to a cold dark matter population compared to existing approaches, establishing machine learning as a powerful tool for characterizing dark matter using astrometric data.

Siddharth Mishra-Sharma
-
(Poster) [ Visit Poster at Spot E0 in Virtual World ]

Dynamical systems are ubiquitous and are often modeled using a non-linear system of governing equations. Numerical solution procedures for many dynamical systems have existed for several decades, but can be slow due to the high-dimensional state space of the dynamical system. Thus, deep learning-based reduced order models (ROMs) are of interest, and one such family of algorithms is based on Koopman theory. This paper extends a recently developed adversarial Koopman model (Balakrishnan \& Upadhyay, arXiv:2006.05547) to stochastic space, where the Koopman operator acts on the probability distribution of the encoder's latent representation. Specifically, the latent encoding of the system is modeled as a Gaussian and is advanced in time by using an auxiliary neural network that outputs two Koopman matrices $K_{\mu}$ and $K_{\sigma}$. Adversarial and gradient losses are used, and this is found to lower the prediction errors. A reduced Koopman formulation is also undertaken where the Koopman matrices are assumed to have a tridiagonal structure, and this yields predictions comparable to the baseline model with full Koopman matrices. The efficacy of the stochastic Koopman model is demonstrated on different test problems in chaos, fluid dynamics, combustion, and reaction-diffusion models. The proposed model is also applied in a setting where the Koopman matrices are conditioned on other input parameters for generalization, and this is applied to simulate the state of a Lithium-ion battery in time. The Koopman models discussed in this study are very promising for the wide range of problems considered.
Kaushik Balakrishnan · Devesh Upadhyay
-
(Poster) [ Visit Poster at Spot D3 in Virtual World ]   
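
The latent update in the abstract above can be sketched in a few lines: a Gaussian latent state, parameterized by a mean and a standard-deviation vector, is advanced linearly by two Koopman matrices. All values and dimensions here are hypothetical stand-ins for what the encoder and auxiliary network would produce.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4

# Hypothetical latent Gaussian produced by an encoder: a mean vector
# and a standard-deviation vector (all values are stand-ins).
mu = rng.standard_normal(latent_dim)
sigma = np.exp(0.1 * rng.standard_normal(latent_dim))

# Stand-ins for the Koopman matrices K_mu and K_sigma that the
# auxiliary network would output; near-identity for stability.
K_mu = np.eye(latent_dim) + 0.1 * rng.standard_normal((latent_dim, latent_dim))
K_sigma = np.eye(latent_dim) + 0.1 * rng.standard_normal((latent_dim, latent_dim))

def koopman_step(mu, sigma, K_mu, K_sigma):
    """Advance the latent distribution parameters one step linearly."""
    return K_mu @ mu, K_sigma @ sigma

mu_next, sigma_next = koopman_step(mu, sigma, K_mu, K_sigma)
print(mu_next.shape, sigma_next.shape)
```

The reduced formulation mentioned in the abstract would additionally constrain `K_mu` and `K_sigma` to a tridiagonal mask.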

Automated model discovery of partial differential equations (PDEs) usually considers a single experiment or dataset to infer the underlying governing equations. In practice, experiments have inherent natural variability in parameters, initial conditions, and boundary conditions that cannot simply be averaged out. We introduce a randomised adaptive group Lasso sparsity estimator to promote grouped sparsity and implement it in a deep learning based PDE discovery framework. This creates a learning bias encoding the a priori assumption that all experiments can be explained by the same underlying PDE terms with potentially different coefficients. Our experimental results show that more generalizable PDEs can be found from multiple highly noisy datasets by promoting grouped sparsity rather than performing independent model discoveries.

Georges Tod · Gert-Jan Both · Remy Kusters
-
(Poster) [ Visit Poster at Spot D2 in Virtual World ]
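
The grouped-sparsity idea above can be illustrated with a minimal numpy sketch: candidate PDE terms are rows, experiments are columns, and the group Lasso penalty sums the per-term L2 norm across experiments, so a term is switched on or off for all experiments at once. The coefficient values are invented for illustration, and this shows only the penalty, not the full randomised adaptive estimator.

```python
import numpy as np

def group_lasso_penalty(coeffs, lam=1.0):
    """Sum of per-term L2 norms across experiments: zeroes out a PDE
    term for *all* experiments at once, rather than per dataset."""
    # coeffs: (n_candidate_terms, n_experiments)
    return lam * np.sum(np.linalg.norm(coeffs, axis=1))

# Two experiments sharing the same active terms (rows) with
# different coefficients -- the structure the estimator promotes.
coeffs = np.array([[1.0, 0.8],     # e.g. a u_xx term, active in both
                   [0.0, 0.0],     # inactive term, zero in both
                   [-0.5, -0.6]])  # e.g. a u*u_x term
print(group_lasso_penalty(coeffs))
```

Note the inactive row contributes nothing to the penalty, which is what makes the grouped norm a sparsity prior over shared PDE terms.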

In high-background or calibration measurements with cryogenic particle detectors, a significant share of the exposure is lost due to pile-up of recoil events. We propose a method for the separation of pile-up events with an LSTM neural network and evaluate its performance on an exemplary data set. Despite a non-linear detector response function, we can reconstruct the ground truth of a severely distorted energy spectrum reasonably well.

Felix Wagner
-
(Poster) [ Visit Poster at Spot D1 in Virtual World ]

Particle-In-Cell (PIC) methods are frequently used for kinetic, high-fidelity simulations of plasmas. Implicit formulations of PIC algorithms feature strong conservation properties, up to numerical round-off errors, and are not subject to time-step limitations, which makes them attractive candidates for simulations of fusion plasmas. Currently they remain prohibitively expensive for high-fidelity simulation of macroscopic plasmas. We investigate how amortized solvers can be incorporated with PIC methods for simulations of plasmas. Incorporated into the amortized solver, a neural network predicts a vector space that entails an approximate solution of the PIC system. The network uses only fluid moments and the electric field as input, and its output is used to augment the vector space of an iterative linear solver. We find that this approach reduces the average number of required solver iterations by about 25% when simulating electron plasma oscillations. This novel approach may make it possible to accelerate implicit PIC simulations while retaining all conservation laws and may also be appropriate for multi-scale systems.

Ralph Kube · Randy Churchill
-
(Poster) [ Visit Poster at Spot D0 in Virtual World ]

The Compact Muon Solenoid (CMS) detector is one of two general-purpose detectors on the energy frontier of particle physics at the CERN Large Hadron Collider (LHC). Products of proton-proton collisions at a center of mass energy of 13 TeV are reconstructed in the CMS detector to probe the standard model of particle physics, and to search for processes beyond the standard model. The development of precision algorithms for this reconstruction is therefore a key objective in optimizing the precision of all physics results at CMS. While machine learning techniques are now prevalent at CMS for these tasks, they have largely relied on high-level human-engineered input features. However, much of the disruptive impact of machine learning in industry has been realized by bypassing human feature engineering and instead training deep learning algorithms on low-level data. We have developed a novel machine learning architecture based on dynamic graph neural networks which allows regression directly on low-level detector hits, and we have applied this model to the calibration of electron and photon energies in CMS. In this work, the performance of our new architecture is shown on electrons used in the calibration of the CMS detector, where we obtain an improvement in energy resolution by as much as 10% with respect to the previous state-of-the-art reconstruction method.

Simon Rothman
-
(Poster) [ Visit Poster at Spot C3 in Virtual World ]   

Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift --- i.e., the fraction of times the true redshift falls between two limits z1 and z2 should be equal to the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to re-calibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local re-calibration of photometric redshift PDFs. Though we focus on an example from astrophysics, our method can produce PDFs that are calibrated at all locations in feature space for any use case.

Biprateep Dey · Ann Lee · Rafael Izbicki · David Zhao
-
(Poster) [ Visit Poster at Spot C2 in Virtual World ]
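
The PIT diagnostic at the heart of the abstract above is simple to state: for a calibrated PDF, the CDF evaluated at the true value is uniform on [0, 1]. A minimal sketch with a sample-based predictive PDF (the Gaussian parameters are hypothetical, and this shows only the global PIT check, not the local re-calibration):

```python
import numpy as np

rng = np.random.default_rng(1)

# The PIT value of a true redshift under a predictive PDF represented
# by samples is the fraction of predictive samples below the truth.
def pit(pdf_samples, z_true):
    return np.mean(pdf_samples < z_true)

# Hypothetical calibrated case: each truth is drawn from the same
# distribution as its predictive samples, so PITs should be ~uniform,
# with mean near 0.5.
pits = np.array([
    pit(rng.normal(0.5, 0.1, size=500), rng.normal(0.5, 0.1))
    for _ in range(2000)
])
print(round(pits.mean(), 2))
```

A miscalibrated PDF (e.g. too narrow) would instead pile PIT values up near 0 and 1.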

The recent increase in frequency and severity of natural disasters is a clear indication of an immediate need to address the cascading impacts of climate change. However, climate change cannot be measured directly. In a weather cycle, river discharge is the end result of any hydrologic process, and thus directly measures the effect of two major parameters used to measure impacts of climate change: temperature and precipitation. Unlike current methods that are able to infer climate change patterns over a long period of time, river discharge is an effective proxy for measuring the effects of climate change within a short period of time. Unfortunately, current statistical and physics-based models neither take full advantage of hydro-meteorological information encoded in over 100 years of historical hydrologic data nor are they applicable on a global scale. In this work, we train Long Short Term Memory (LSTM) Recurrent Neural Network models on satellite observations and daily discharge from gauged basins. Our models outperform the latest state-of-the-art process-based hydrology models with Kling-Gupta and Nash-Sutcliffe Efficiency scores of 85\% and 81\% respectively in ungauged basins with limited or no existing data. This will allow accurate predictions in the majority of the global river basins that do not have in-situ measurements.

Aggrey Muhebwa · Jay Taneja
-
(Poster) [ Visit Poster at Spot C1 in Virtual World ]   

Understanding the nonlinear manifolds of scientific data extracted via autoencoders is important to propel practical uses of non-intrusive reduced-order modeling in the community. We here tackle this matter by visualizing nonlinear autoencoder modes with the aid of a mode-decomposing convolutional neural network autoencoder (MD-CNN-AE). The MD-CNN-AE has a customized decoder, which enables us to visualize individual modes extracted through the encoder. The present demonstration is performed with a three-dimensional flow around a square cylinder at Re_D=300, which possesses complex vortical phenomena associated with strong nonlinearities. The results are compared with a conventional linear model order reduction method, i.e., principal component analysis (PCA). The reconstructed fields with MD-CNN-AE hold more energetic information than those with PCA, despite the same number of latent variables. The present results indicate the strong capability of MD-CNN-AE for efficient low-dimensionalization and data compression of three-dimensional flow fields in an interpretable manner.

Kazuto Hasegawa · Kai Fukami · Koji Fukagata
-
(Poster) [ Visit Poster at Spot C0 in Virtual World ]   

Sharpness-aware minimization (SAM) is a novel regularization technique that takes advantage of not only the training error but also the landscape geometry of model parameters to improve model robustness. Although SAM has demonstrated state-of-the-art (SOTA) performance in image classification, its applicability to physical systems is yet to be examined. An ideal testbed is neural-network quantum molecular dynamics (NNQMD) simulations, which accurately predict material properties but whose trajectory stability is severely limited by thermal noise. In this paper, we demonstrate for the first time that the SAM regularizer achieves an order-of-magnitude reduction of the out-of-sample error in potential energy prediction using several SOTA models. Comparing NNQMD datasets with distinct structural characteristics, we found that SAM consistently reduces the out-of-sample error for a crystal dataset at high temperatures with enhanced thermal noise, thus proving the concept of SAM-enhanced robust NNQMD, while no clear trend was observed with an amorphous dataset. Our result suggests a possible correlation between material structure and the model parameter landscape.

Hikaru Ibayashi · Ken-ichi Nomura · Pankaj Rajak · Aravind Krishnamoorthy · Aiichiro Nakano
-
(Poster) [ Visit Poster at Spot B3 in Virtual World ]
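
SAM's two-step update (first ascend to the worst-case point within a radius rho of the current parameters, then take the gradient step from there) can be sketched on a toy quadratic loss. The learning rate, rho, and loss are illustrative, not the NNQMD setup.

```python
import numpy as np

# Toy loss L(w) = 0.5 * ||w||^2, so grad L(w) = w.
def grad(w):
    return w

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    # Step 1: perturb toward the worst case within radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: descend using the gradient at the perturbed point.
    return w - lr * grad(w + eps)

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
print(np.linalg.norm(w))  # driven toward the minimum at 0
```

The only difference from plain gradient descent is that the descent gradient is evaluated at `w + eps`, which penalizes sharp minima.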

Modeling the subgrid-scale dynamics of reduced models is a long-standing open problem that finds application in ocean, atmosphere, and climate predictions where direct numerical simulation (DNS) is impossible. While neural networks (NNs) have already been applied with success to a range of three-dimensional problems, the backward energy transfer of two-dimensional flows still remains a stability issue for trained models. We show that learning a model jointly with the dynamical solver and a meaningful \textit{a posteriori}-based loss function leads to stable and realistic simulations when applied to quasi-geostrophic turbulence.

Hugo Frezat · ronan fablet · Redouane Lguensat
-
(Poster) [ Visit Poster at Spot B2 in Virtual World ]   

Extreme-ultraviolet images taken by the Atmospheric Imaging Assembly make it possible to use deep vision techniques in the prediction of solar wind speed - a difficult, high-impact, and unsolved problem. This study uses vision transformers and a set of methodological and modelling improvements to deliver an 11.1% lower RMSE and a 17.4% higher prediction correlation compared to the previous state-of-the-art models. Furthermore, our analysis shows that vision transformers combined with our pipeline consistently outperform convolutional alternatives. Additionally, the best vision transformer outperforms the best convolutional model by 1.8% in RMSE and 2.6% in correlation with the ground truth solar wind speed.

Filip Svoboda · Edward Brown
-
(Poster) [ Visit Poster at Spot B1 in Virtual World ]   

Computing the energy of molecules plays a critical role in molecule design. Classical ab-initio methods using Density Functional Theory (DFT) often suffer from scalability issues due to their extreme computing cost. A growing number of data-driven neural-net-based DFT surrogate models have been proposed to address this challenge. Once trained on ab-initio reference data, these models significantly accelerate the energy prediction of molecular systems, circumventing the numerical solution of the Schrödinger equation. However, the performance of these models is often limited to the scope of the training data distribution. It is also challenging to discover physical insights from their predictions due to the lack of interpretability of neural networks. In this paper, we aim to design a physics-ML hybrid DFT surrogate model which is both physically interpretable and generalizable beyond the training data distribution. To achieve these goals, we propose a physics-driven approach that fits the energy to an equation combining Coulomb and Lennard-Jones potentials by first predicting their sub-parameters and then computing the energy from the equation. Our experimental results show the effectiveness of the proposed approach in its performance, generalizability, and interpretability.

Youngwoo Cho · Marco Yi · Jaegul Choo · Joonseok Lee · Sookyung Kim
-
(Poster) [ Visit Poster at Spot B0 in Virtual World ]   

Transformers are state-of-the-art deep learning models composed of stacked attention and point-wise, fully connected layers designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP), but have also recently inspired a new wave of Computer Vision (CV) applications research. In this work, a Vision Transformer (ViT) is fine-tuned to predict the state variables of 2-dimensional Ising model simulations. Our experiments show that ViT outperforms state-of-the-art Convolutional Neural Networks (CNNs) when using a small number of microstate images from the Ising model corresponding to various boundary conditions and temperatures. This work explores possible applications of ViT to other simulations and introduces interesting research directions on how attention maps can learn the underlying physics governing different phenomena.

Onur Kara · Arijit Sehanobish · HECTOR CORZO
-
(Poster) [ Visit Poster at Spot A3 in Virtual World ]

We apply a physics-informed neural network framework for inversely retrieving the effective material parameters of a two-dimensional metasurface from its scattered field(s). We show that by employing a loss function based on the Helmholtz wave equation, we can model the performance of a metamaterial disc-shaped structure and split-ring resonator with great promise, and demonstrate the dependence of resonant behavior on the homogenized electric permittivity distribution profile generated by our network.

Parama Pal · Prajith P
-
(Poster) [ Visit Poster at Spot A2 in Virtual World ]   

We present a new machine learning library for computing metrics of string compactification spaces. We benchmark the performance on Monte-Carlo sampled integrals against previous numerical approximations and find that our neural networks are more sample- and computation-efficient. We are the first to make it possible to compute these metrics for arbitrary, user-specified shape and size parameters of the compact space, and we observe a linear relation between the optimization of the partial differential equation we are training against and vanishing Ricci curvature.

Robin Schneider
-
(Poster) [ Visit Poster at Spot A1 in Virtual World ]   

In this report, we present a deep learning framework termed the Electron Correlation Potential Neural Network (eCPNN) that can learn succinct and compact potential functions. These functions can effectively describe the complex instantaneous spatial correlations among electrons in many-electron atoms. The eCPNN was trained in an unsupervised manner with limited information from Full Configuration Interaction (FCI) one-electron density functions within predefined limits of accuracy. Using the effective correlation potential functions generated by eCPNN, we can predict the total energies of each of the studied atomic systems with remarkable accuracy when compared to FCI energies.

HECTOR CORZO · Arijit Sehanobish · Onur Kara
-
(Poster) [ Visit Poster at Spot A0 in Virtual World ]   

X-ray polarimetry will soon open a new window on the high energy universe with the launch of NASA's Imaging X-ray Polarimetry Explorer (IXPE). Polarimeters are currently limited by their track reconstruction algorithms, which use linear estimators and do not consider individual event quality. We present a modern deep learning method for maximizing the sensitivity of X-ray telescopic observations with imaging polarimeters, with a focus on the gas pixel detectors (GPDs) to be flown on IXPE. We use a weighted maximum likelihood combination of predictions from a deep ensemble of ResNets, trained on Monte Carlo event simulations. We derive and apply the optimal event weighting for maximizing the signal-to-noise ratio (SNR) in track reconstruction algorithms. For typical power-law source spectra, our method improves on the current state of the art, providing a ~40% decrease in required exposure times.

Lawrence Peirson
-
(Poster) [ Visit Poster at Spot J3 in Virtual World ]
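
The event-weighting idea behind the abstract above rests on a generic statistical fact: combining per-event estimates with inverse-variance weights yields a lower-variance result than uniform averaging. A hedged numpy sketch with entirely hypothetical numbers (this illustrates the weighting principle only, not IXPE's actual weights or the deep-ensemble likelihood combination):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-event estimates of a track angle, each with its own
# quality (standard deviation) as a deep ensemble might predict.
true_angle = 0.3
sigmas = rng.uniform(0.05, 1.0, size=5000)           # per-event quality
est = true_angle + sigmas * rng.standard_normal(5000)

w = 1.0 / sigmas**2                                  # inverse-variance weights
weighted = np.sum(w * est) / np.sum(w)

# Standard error of the weighted vs. uniform combination:
se_weighted = np.sqrt(1.0 / np.sum(w))
se_uniform = np.sqrt(np.mean(sigmas**2) / len(est))
print(se_weighted < se_uniform)
```

Down-weighting poorly reconstructed events is what converts per-event quality estimates into the SNR gain the abstract reports.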

Computer-aided retrosynthesis accelerates and transforms the process of molecule and material design, allowing the discovery of new pathways and automating part of the overall development process for drugs and materials. Current machine-learning methods applied to retrosynthesis are limited by their lack of control when generating single-step reactions, as they rely on sampling or beam search algorithms. In this work, we apply vector quantized representation learning [1] to learn reaction classes along with retrosynthetic predictions. We represent each reaction class with a vector, allowing us to condition the retrosynthetic prediction. We show that learning reaction classes increases control as well as generating more diverse predictions than a baseline model. Our results are a significant step forward in the development of multistep retrosynthesis prediction.

Théophile Gaudin · Animesh Garg · Alan Aspuru-Guzik
-
(Poster) [ Visit Poster at Spot J2 in Virtual World ]

We introduce a method to simulate parametrized quantum circuits, an architecture behind many practical algorithms on near-term hardware, focusing on the Quantum Approximate Optimization Algorithm (QAOA). A neural-network parametrization of the many-qubit wave function is used, reaching 54 qubits at 4 QAOA layers, approximately implementing 324 RZZ gates and 216 RX gates without requiring large-scale computational resources. Our approach can be used to provide accurate QAOA simulations at previously unexplored parameter values and to benchmark the next generation of experiments in the Noisy Intermediate-Scale Quantum (NISQ) era.

Matija Medvidović · Giuseppe Carleo
-
(Poster) [ Visit Poster at Spot J1 in Virtual World ]

Computational Fluid Dynamics solvers have benefited from strong developments for decades, being critical for many scientific and industrial applications. The downside of their great accuracy is a requirement for tremendous computational resources. In this short article, we present our on-going work to design a data-driven deep surrogate: a neural network that is trained to provide a quality solution to the Navier-Stokes equations for a given domain, initial and boundary conditions. The resulting surrogate is expected to substitute for traditional solvers in a limited range of input conditions, and enable interactive parameter exploration, sensitivity analysis, and digital twins. Some approaches to building data-driven surrogates mimic the solver's iterative process, being trained to compute the fluid transition from a time step t to t+1. Other surrogates are trained to directly produce a time step t, and are called "direct time". Surrogates also differ in their approach to space discretization. If the mesh is a regular grid, CNNs can be used. Irregular meshes or particle-based approaches are more challenging, and can be addressed through variations of graph neural networks (GNNs). Our contribution is a novel direct-time GNN architecture for irregular meshes. It consists of a succession of graphs of increasing size connected by spline convolutions. Early experiments with the von Kármán vortex street benchmark show that our architecture achieves small generalization errors (RMSE of about 10^-3) and is not subject to error accumulation along the trajectory.

Lucas Meyer · Bruno Raffin
-
(Poster) [ Visit Poster at Spot J0 in Virtual World ]

We study the use of normalizing flows to represent the field-level probability density of random fields in cosmology, such as the matter and radiation distributions. We evaluate the performance of the real NVP flow for sampling Gaussian and near-Gaussian random fields as well as N-body simulations, and check the quality of samples with statistics such as power spectrum and bispectrum estimators. We explore aspects of these flows that are specific to cosmology, such as flowing from a physical prior distribution and evaluating the density estimation results in the analytically tractable correlated Gaussian case.

Adam Rouhiainen
-
(Poster) [ Visit Poster at Spot I3 in Virtual World ]
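
Real NVP's building block, the affine coupling layer, is easy to sketch: half the variables pass through unchanged and condition a scale-and-shift of the other half, so the transform is exactly invertible and the log-determinant is just the sum of the predicted log-scales. The "networks" s and t below are hypothetical linear stand-ins, and the dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

d = 8
# Stand-ins for the scale (s) and translation (t) networks.
W_s, W_t = rng.standard_normal((2, d // 2, d // 2)) * 0.1

def coupling_forward(x):
    x1, x2 = x[: d // 2], x[d // 2 :]
    s, t = W_s @ x1, W_t @ x1
    y2 = x2 * np.exp(s) + t         # affine transform of the second half
    log_det = np.sum(s)             # triangular Jacobian: sum of log-scales
    return np.concatenate([x1, y2]), log_det

def coupling_inverse(y):
    y1, y2 = y[: d // 2], y[d // 2 :]
    s, t = W_s @ y1, W_t @ y1
    return np.concatenate([y1, (y2 - t) * np.exp(-s)])

x = rng.standard_normal(d)
y, log_det = coupling_forward(x)
print(np.allclose(coupling_inverse(y), x))  # exact invertibility
```

Stacking such layers with alternating partitions (here, alternating which half passes through) gives the full flow; for cosmological fields, the linear maps would be replaced by convolutional networks over pixels.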

Sub-seasonal forecasting (SSF) is the prediction of key climate variables such as temperature and precipitation on a 2-week to 2-month time horizon. Skillful SSF would have substantial societal value in areas such as agricultural productivity, water resource management, and emergency planning for droughts and wildfires. Despite its societal importance, SSF has remained a challenging problem and mainly relies on physics-based dynamical models. Meanwhile, recent studies have shown the potential of machine learning (ML) models to advance SSF. In this paper, we show that suitably incorporating dynamical model forecasts as inputs to ML models can substantially improve their forecasting performance. The SSF dataset and codebase constructed for this work will be made available along with the paper.

Sijie He · Xinyan Li · Laurie Trenary · Benjamin Cash · Timothy DelSole · Arindam Banerjee
-
(Poster) [ Visit Poster at Spot I2 in Virtual World ]   

In this work, we study the emergence of sparsity and multiway structures in second-order characterizations of dynamical processes governed by partial differential equations (PDEs). We consider several state-of-the-art multiway covariance and inverse covariance (precision) matrix estimators and examine their pros and cons in terms of accuracy and interpretability in the context of physics-driven forecasting using the ensemble Kalman filter (EnKF). In particular, we show that multiway data generated from the Poisson and the convection-diffusion types of PDEs can be accurately tracked via EnKF when integrated with appropriate covariance and precision matrix estimators.

Wayne Wang · Alfred Hero
-
(Poster) [ Visit Poster at Spot I1 in Virtual World ]
Supersymmetric quantum gauge theories are important mathematical tools in high energy physics.
As an example, supersymmetric matrix models can be used as a holographic description of quantum black holes.
The wave function of such supersymmetric gauge theories is not known and it is challenging to obtain with traditional techniques.
We employ a neural quantum state ansatz for the wave function of a supersymmetric matrix model and use a variational quantum Monte Carlo approach to discover the ground state of the system.
We discuss the difficulty of including bosonic particles and fermionic particles, as well as gauge degrees of freedom.
Enrico Rinaldi
-
(Poster) [ Visit Poster at Spot I0 in Virtual World ]

The large-scale structure of the Universe is the direct consequence of its evolution over billions of years. Observations of this large-scale structure in the form of galaxy redshift surveys contain valuable cosmological information, and in order to extract that information, we need to compare these observations to corresponding theory predictions from cosmological simulations, whose generation is in itself a very computationally intensive feat. This work uses deep convolutional neural networks to simulate the large-scale structure of the Universe and generate a typical cosmological simulation orders of magnitude faster than standard N-body simulations, within an accuracy of ~0.1% on the most common cosmological summary statistics. The most important feature of our model is that it extrapolates extremely well to universes with entirely different cosmologies than the one it has been trained on. The use of such an approach will be particularly useful in the near future to compare theory predictions with observations, to generate mock galaxy catalogs, to compute covariance matrices, and to optimize observational strategies.

Neerav Kaushal · Elena Giusarma
-
(Poster) [ Visit Poster at Spot H3 in Virtual World ]   

Interacting particle or agent systems that display a rich variety of collective motions are ubiquitous in science and engineering. The fundamental and challenging goals are to infer the individual interaction rules that yield collective behaviors and to establish the governing equations. In this paper, we study the data-driven discovery of second-order interacting particle systems with distance-based interaction laws, which are known to have the capability to reproduce a rich variety of collective patterns. We propose a learning approach that models the latent interaction function as a Gaussian process, which can simultaneously fulfill two inference goals: one is the nonparametric inference of the interaction function with pointwise uncertainty quantification, and the other is the inference of unknown parameters in the non-collective forces of the system. We test the learning approach on the D'Orsogna model, and numerical results demonstrate its effectiveness.

Sui Tang
-
(Poster) [ Visit Poster at Spot H2 in Virtual World ]

In this study, a novel physics-informed neural network (PINN) is proposed to allow efficient training with improved accuracy. PINNs typically constrain their training loss function with differential equations to ensure outputs obey underlying physics. These differential operators are typically computed via automatic differentiation (AD), but this can fail with insufficient collocation points. Hence, the idea of coupling both AD and numerical differentiation (ND) is employed. The proposed coupled-automatic-numerical differentiation scheme (can-PINN) strongly links collocation points, thus enabling efficient training while being more accurate than simply using ND. As a demonstration, two instantiations of can-PINN were derived for the incompressible Navier-Stokes equations and applied to modeling of lid-driven flow in a cavity. Results show that can-PINNs can achieve very good accuracy even when the corresponding AD-based PINN fails.

Pao-Hsiung Chiu · Chin Chun Ooi · Yew Soon Ong
-
(Poster) [ Visit Poster at Spot H1 in Virtual World ]
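
The numerical-differentiation ingredient described above can be sketched with a central-difference residual: unlike pointwise automatic differentiation, it couples each collocation point to its neighbours. This is a generic ND sketch only, not the authors' exact AD-ND coupling scheme, and the test function is an arbitrary stand-in for a network's output.

```python
import numpy as np

def second_derivative_nd(u, x):
    """Central-difference u_xx at interior points of a uniform grid."""
    h = x[1] - x[0]
    return (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2

x = np.linspace(0.0, 1.0, 101)
u = np.sin(np.pi * x)                 # stand-in for a network's output
u_xx = second_derivative_nd(u, x)

# For u = sin(pi x), the exact second derivative is -pi^2 sin(pi x);
# the central-difference error is O(h^2) at interior points.
err = np.max(np.abs(u_xx + np.pi**2 * np.sin(np.pi * x[1:-1])))
print(err)
```

In a PINN loss, a residual like `u_xx - f` built from such stencils constrains neighbouring collocation points jointly, which is the property can-PINN exploits alongside AD.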

We introduce a novel method for constructing symmetry-preserving attention networks which reflect the natural invariances of the jet-parton assignment problem to efficiently find assignments without evaluating all permutations. This general approach is applicable to arbitrarily complex configurations and significantly outperforms current methods, improving reconstruction efficiency by between 19% and 35% on benchmark problems while decreasing inference time by two to five orders of magnitude, making many important and previously intractable cases tractable.

Alex Shmakov · Shih-chieh Hsu · Pierre Baldi
-
(Poster) [ Visit Poster at Spot H0 in Virtual World ]

Open quantum systems can undergo dissipative phase transitions, and their critical behavior can be used to enhance, e.g., the fidelity of superconducting qubit readout measurements, a central problem toward the creation of reliable quantum hardware. For example, a recently introduced measurement protocol, named ``critical parametric quantum sensing'', uses the parametric (two-photon driven) Kerr resonator's driven-dissipative phase transition to reach a single-qubit detection fidelity of 99.9\%. We apply machine learning classification algorithms to the time series data of weak quantum measurements (homodyne detection) of a circuit-QED implementation of the Kerr resonator coupled to a superconducting qubit. This demonstrates how machine learning methods enable a fast and reliable measurement protocol in critical open quantum systems.

Enrico Rinaldi
-
(Poster) [ Visit Poster at Spot G3 in Virtual World ]   

We propose a continuous normalizing flow for sampling from the high-dimensional probability distributions of Quantum Field Theories in Physics. In contrast to the deep architectures used so far for this task, our proposal is based on a shallow design and incorporates the symmetries of the problem. We test our model on the ϕ⁴ theory, showing that it systematically outperforms a realNVP baseline in sampling efficiency, with the difference between the two increasing for larger lattices. On the largest lattice we consider, of size 32 x 32, we improve a key metric, the effective sample size, from 1% to 66% w.r.t. the realNVP baseline.

Pim de Haan · Roberto Bondesan
-
(Poster) [ Visit Poster at Spot G2 in Virtual World ]

The prediction of quantum mechanical properties is historically plagued by a trade-off between accuracy and speed. Machine learning potentials have previously shown great success in this domain, reaching increasingly better accuracy while maintaining computational efficiency comparable with classical force fields. In this work we propose a novel equivariant Transformer architecture, outperforming state-of-the-art on MD17 and ANI-1. Through an extensive attention weight analysis, we gain valuable insights into the black box predictor and show differences in the learned representation of conformers versus conformations sampled from molecular dynamics or normal modes. Furthermore, we highlight the importance of datasets including off-equilibrium conformations for the evaluation of molecular potentials.

Philipp Thölke · Gianni De Fabritiis
-
(Poster) [ Visit Poster at Spot G1 in Virtual World ]

Many Bayesian inference problems in cosmology involve complex models. Despite the fact that these models have been meticulously designed, they can lead to intractable likelihoods, and each forward simulation itself can be computationally expensive, thus making the inverse problem of learning the model parameters a challenging task. In this paper, we develop an approximate model for the 3D matter power spectrum, P(k,z), which is a central quantity in a weak lensing analysis. Important outputs of this approximate model, often referred to as a surrogate model or emulator, are the first and second derivatives with respect to the input cosmological parameters. Without the emulator, the calculation of the derivatives requires multiple calls of the simulator, that is, the accurate Boltzmann solver, CLASS. We illustrate the application of the emulator in the calculation of different weak lensing and intrinsic alignment power spectra, and we also demonstrate its performance on a toy simulated weak lensing dataset.

Arrykrishna Mootoovaloo
-
(Poster) [ Visit Poster at Spot G0 in Virtual World ]   

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their predictions for many molecular properties. However, this information is infeasible to compute at the scale required by most real-world applications. We propose pre-training a model to understand the geometry of molecules given only their 2D molecular graph. Using methods from self-supervised learning, we maximize the mutual information between a 3D summary vector and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to inform downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of molecular properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Crucially, the learned representations can be effectively transferred between datasets with vastly different molecules.
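
A common way to maximize mutual information between two representations is a contrastive (InfoNCE-style) lower bound; the toy sketch below applies it to hypothetical 2D-graph and 3D-summary embeddings. The batch size, dimensions, and noise level are illustrative and not taken from the paper.

```python
import math
import random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(graph_embs, geom_embs, temperature=0.1):
    """Toy InfoNCE loss: each 2D-graph embedding should score highest
    against its own 3D summary vector (the positive pair)."""
    loss = 0.0
    for i, g in enumerate(graph_embs):
        scores = [dot(g, s) / temperature for s in geom_embs]
        m = max(scores)  # log-sum-exp stabilization
        log_denom = m + math.log(sum(math.exp(s - m) for s in scores))
        loss += log_denom - scores[i]  # equals -log softmax of the positive
    return loss / len(graph_embs)

# Hypothetical embeddings for a batch of 3 molecules: the 3D summary is a
# slightly noisy copy of the graph embedding, so positives align well.
graph_embs = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
geom_embs = [[x + random.gauss(0, 0.05) for x in g] for g in graph_embs]
loss = info_nce(graph_embs, geom_embs)
```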

Hannes Stärk · Gabriele Corso · Christian Dallago · Stephan Günnemann · Pietro Lió
-
(Poster) [ Visit Poster at Spot F3 in Virtual World ]

Improving the predictive capability of molecular properties in ab-initio simulations is essential for advanced material discovery. Despite recent progress in machine learning, using deep neural networks to improve quantum chemistry modelling remains severely limited by the scarcity and heterogeneity of appropriate experimental data. Here we show that training a neural network to replace the exchange-correlation functional within a fully-differentiable three-dimensional Kohn-Sham density functional theory (DFT) simulation can greatly improve its accuracy and generalizability. Using only eight experimental data points on diatomic molecules, our trained exchange-correlation networks provide improved predictions of atomization energies across a collection of 104 molecules containing new bonds and atoms that are not present in the training set.

Muhammad Firmansyah · Sam Vinko
-
(Poster) [ Visit Poster at Spot F2 in Virtual World ]   

Modern digital cameras and smartphones mostly rely on image signal processing (ISP) pipelines to produce realistic colored RGB images. However, compared to DSLR cameras, low-quality images are usually obtained by portable mobile devices with compact camera sensors due to their physical limitations. These low-quality images suffer from multiple degradations, i.e., sub-pixel shift due to camera motion, mosaic patterns due to the camera color filter array, low resolution due to smaller camera sensors, and corruption of the remaining information by noise. Such degradations limit the performance of current Single Image Super-resolution (SISR) methods in recovering high-resolution (HR) image details from a single low-resolution (LR) image. In this work, we propose a Raw Burst Super-Resolution Iterative Convolutional Neural Network (RBSRICNN) that follows the burst photography pipeline as a whole through a forward (physical) model. In contrast to existing black-box data-driven methods, the proposed Burst SR scheme solves the problem with classical image regularization, convex optimization, and deep learning techniques. The proposed network produces the final output by iterative refinement of the intermediate SR estimates. We demonstrate the effectiveness of our approach in quantitative and qualitative experiments that generalize robustly to real LR burst inputs with only synthetic burst data available for training.

Rao Umer · Christian Micheloni
-
(Poster) [ Visit Poster at Spot F1 in Virtual World ]   

We propose a new model-agnostic search strategy for hints of new fundamental forces motivated by applications in particle physics. It is based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes that potential signal events cluster in phase space in a signal region. However, backgrounds due to known processes are also present in the signal region and are too large for such a signal to be detected directly. By training a conditional density estimator on a collection of additional features outside the signal region, interpolating it into the signal region, and sampling from it, we produce a collection of events that follow the background model. We can then train a classifier to distinguish the data from the events sampled from the background model, thereby approaching the optimal anomaly detector. Using the public LHC Olympics R&D data set, we demonstrate that CATHODE nearly saturates the best possible performance and significantly outperforms other approaches in this bump hunt paradigm.
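
The sideband-interpolation logic behind this family of bump hunts can be caricatured in one dimension. The sketch below is not CATHODE itself (which learns a conditional density estimator over many features); it only shows the estimate-background-from-sidebands step on a toy exponential spectrum, with all numbers illustrative.

```python
import random

random.seed(1)

# Toy "invariant mass" spectrum: a smoothly falling exponential background
# plus a small bump hidden in the signal region [2.5, 3.5].
background = [random.expovariate(0.5) for _ in range(20000)]
signal = [random.gauss(3.0, 0.1) for _ in range(100)]
data = background + signal

def count(lo, hi, xs):
    return sum(lo <= x < hi for x in xs)

# Model the background level in the sidebands and interpolate it into the
# (blinded) signal region. The geometric mean is exact for an exponential.
left = count(1.5, 2.5, data)
right = count(3.5, 4.5, data)
expected_bkg = (left * right) ** 0.5
observed = count(2.5, 3.5, data)
excess = observed - expected_bkg
```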

Joshua Isaacson · Gregor Kasieczka · Benjamin Nachman · David Shih · Manuel Sommerhalder
-
(Poster) [ Visit Poster at Spot F0 in Virtual World ]

Nucleation phenomena commonly observed in everyday life are of fundamental, technological, and societal importance in many areas, but some of their most intimate mechanisms remain to be unravelled. Crystal nucleation, the early stage where the liquid-to-solid transition occurs upon undercooling, initiates at the atomic level on nanometre length and sub-picosecond time scales and involves complex multidimensional mechanisms with local symmetry breaking that can hardly be observed experimentally in detail. To reveal their structural features in simulations without a priori assumptions, we propose an unsupervised learning approach founded on topological descriptors borrowed from persistent homology. Applied here to a monatomic metal, namely tantalum (Ta), it shows that translational and orientational ordering always come into play simultaneously when homogeneous nucleation starts in regions with low five-fold symmetry.

Emilie Devijver · Rémi Molinier
-
(Poster) [ Visit Poster at Spot E3 in Virtual World ]   

Thermophotovoltaics (TPVs) rely on selective thermal emitters to tailor blackbody radiation at high temperatures into band-matched emission for photovoltaic cells, resulting in power-conversion efficiencies surpassing the Shockley-Queisser limit. The selectivity of the thermal emitter must cover a wavelength range spanning three orders of magnitude, from the visible to the far infrared, which requires the superposition of multiple transformation optics theories and anisotropic geometric degrees of freedom. Realizing such a high-dimensional, complex metasurface design with conventional computational photonics is extremely challenging. Here we develop a deep neural network-based Bayesian optimization (DeepBO) framework to screen a 16-dimensional design space of 10^43 candidates, and realize a record-high spectral efficiency of 69% for the TPV emitter. We show that a neural network combined with Bayesian linear regression is an efficient and robust surrogate model that scales linearly with the size of the data. We also reveal the underlying physical mechanisms of the geometric design of the TPV emitters using principal component analysis (PCA). We anticipate that the DeepBO framework will be a useful tool for data-intensive complex geometric design in the photonics research community.

Zihan Zhang · Kehang Cui · Jintao Chen
-
(Poster) [ Visit Poster at Spot E2 in Virtual World ]

Direct Numerical Simulations and even wall-resolved Large Eddy Simulations remain computationally intractable for full blade-span computations at large Reynolds numbers. The cost of representing turbulence near the wall motivates the development of wall-modeled LES. The present work proposes the use of Deep Neural Nets (DNNs) to link the wall shear stress components to volume data extracted at multiple wall-normal distances $h_{wm}$ and wall-parallel locations. The developed data-driven wall model focuses on the prediction of separation, a frequently observed phenomenon in modern low-pressure turbines. The model is trained using a high-fidelity database of the two-dimensional periodic hill flow, which exhibits separation and is affordable to compute on modern clusters.

Margaux Boxho
-
(Poster) [ Visit Poster at Spot E1 in Virtual World ]   

To achieve a more virtual design and certification process for jet engines in the aviation industry, the uncertainty bounds of computational fluid dynamics have to be known. This work shows the application of a machine learning methodology to quantify the epistemic uncertainties of turbulence models. The underlying method for estimating the uncertainty bounds is based on eigenspace perturbations of the Reynolds stress tensor in combination with random forests.
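
A minimal sketch of the eigenspace-perturbation idea follows, assuming a 2x2 symmetric toy tensor and an illustrative one-component limiting state. Real implementations perturb the eigenvalues (and eigenvectors) of the full 3x3 Reynolds stress anisotropy tensor; everything here is simplified for exposition.

```python
import math

def eig_sym2(a11, a12, a22):
    """Closed-form eigenvalues of a symmetric 2x2 matrix."""
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 + disc, tr / 2.0 - disc

def perturb_eigenvalues(lams, lams_limit, delta):
    """Blend eigenvalues toward a limiting state by a fraction `delta`,
    the core move of eigenspace-perturbation uncertainty estimates."""
    return [(1.0 - delta) * l + delta * t for l, t in zip(lams, lams_limit)]

# Toy anisotropy eigenvalues perturbed toward a one-component limit
# (illustrative values only).
lams = list(eig_sym2(0.2, 0.05, -0.1))
one_component = [2.0 / 3.0, -1.0 / 3.0]
perturbed = perturb_eigenvalues(lams, one_component, delta=0.3)
```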

Marcel Matha
-
(Poster) [ Visit Poster at Spot E0 in Virtual World ]   

Astronomical data is full of holes. While there are many reasons for missing data, it can be randomly missing due to causes such as data corruption or unfavourable observing conditions. We test some simple data imputation methods (Mean, Median, Minimum, Maximum and k-Nearest Neighbours (kNN)), as well as two more complex methods (Multivariate Imputation by Chained Equations (MICE) and Generative Adversarial Imputation Networks (GAIN)), on data where increasing fractions are randomly set to missing. We then use the imputed datasets to estimate the redshift of the galaxies, using the kNN and Random Forest ML techniques. We find that the MICE algorithm provides the lowest Root Mean Square Error and consequently the lowest prediction error, with the GAIN algorithm the next best.
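
The simple baselines above are easy to state precisely; the sketch below implements mean and median imputation on a hypothetical photometric feature and ranks them by RMSE against (equally hypothetical) true values. None of these numbers come from the paper.

```python
import math

# Toy catalogue: one photometric feature with missing entries (None).
values = [22.1, 21.7, None, 23.0, None, 21.9]

observed = [v for v in values if v is not None]
mean_val = sum(observed) / len(observed)
median_val = sorted(observed)[len(observed) // 2]

def impute(vals, fill):
    """Replace every missing entry with a single fill value."""
    return [fill if v is None else v for v in vals]

mean_imputed = impute(values, mean_val)
median_imputed = impute(values, median_val)

def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

# If the (hypothetical) true values were known, RMSE ranks the methods.
truth = [22.1, 21.7, 22.4, 23.0, 22.0, 21.9]
scores = {"mean": rmse(mean_imputed, truth), "median": rmse(median_imputed, truth)}
```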

Kieran Luken
-
(Poster) [ Visit Poster at Spot D3 in Virtual World ]

In many areas of science, complex phenomena are modeled by stochastic parametric simulators, often featuring high-dimensional parameter spaces and intractable likelihoods. In this context, performing Bayesian inference can be challenging. In this work, we present a novel method that enables amortized inference over arbitrary subsets of the parameters, without resorting to numerical integration, which makes interpretation of the posterior more convenient. Our method is efficient and can be implemented with arbitrary neural network architectures. We demonstrate the applicability of the method on parameter inference of binary black hole systems from gravitational-wave observations.

François Rozet · Gilles Louppe
-
(Poster) [ Visit Poster at Spot D2 in Virtual World ]

Gravitational waves (GWs) detected by the LIGO and Virgo observatories encode descriptions of their astrophysical progenitors. To characterize these systems, physical GW signal models are inverted using Bayesian inference coupled with stochastic samplers---a task that can take O(day) for a typical binary black hole. Several recent efforts have attempted to speed this up by using normalizing flows to estimate the posterior distribution conditioned on the observed data. In this study, we further develop these techniques to achieve results nearly indistinguishable from standard samplers when evaluated on real GW data, with inference times of one minute per event. This is enabled by (i) incorporating detector nonstationarity from event to event by conditioning on a summary of the noise characteristics, (ii) using an embedding network adapted to GW signals to compress data, and (iii) adopting a new inference algorithm that makes use of underlying physical equivariances.

Maximilian Dax · Stephen Green · Jakob Macke · Bernhard Schölkopf
-
(Poster) [ Visit Poster at Spot D1 in Virtual World ]   

In this work we use variational inference to quantify the degree of epistemic uncertainty in model predictions of radio galaxy classification and show that the level of model posterior variance for individual test samples is correlated with measures of human uncertainty when labelling radio galaxies. Using the posterior distributions for individual weights, we show that signal-to-noise ratio (SNR) ranking allows pruning of the fully-connected layers to the level of 40% without significant loss of performance, and that this pruning reduces the predictive uncertainty in the model. Finally we show that, like other work in this field, we experience a cold posterior effect. We examine whether the inclusion of an additional variance term in the loss can compensate for this effect, but find that it does not make a significant difference. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample rather than model misspecification and raise this as a potential issue for Bayesian approaches to radio galaxy classification in future.

Anna Scaife
-
(Poster) [ Visit Poster at Spot D0 in Virtual World ]   

Reconstructing spectral functions from Euclidean Green’s functions is an important inverse problem in physics. The prior knowledge for specific physical systems routinely offers essential regularization schemes for solving this ill-posed problem approximately. To this end, we propose an automatic differentiation framework as a generic tool for reconstruction from observable data. We represent the spectra by neural networks and use the chi-square as the loss function, optimizing the parameters with backward automatic differentiation in an unsupervised manner. In the training process, no explicit physical prior is embedded into the neural networks except the positive-definite form. The reconstruction accuracy is assessed through the Kullback–Leibler (KL) divergence and mean squared error (MSE) at multiple noise levels. The automatic differentiation framework and the freedom to introduce regularization are inherent advantages of the present approach and may lead to improvements in solving inverse problems in the future.
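
The loss and the two evaluation metrics named above are standard and easy to write down; the sketch below implements them on a coarse hypothetical frequency grid (the spectra and noise values are illustrative, not from the paper).

```python
import math

def chi_square(pred, data, sigma):
    """Chi-square loss between a reconstructed observable and noisy data."""
    return sum(((p - d) / s) ** 2 for p, d, s in zip(pred, data, sigma))

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence between two normalized spectra."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mse(p, q):
    """Mean squared error between two spectra on the same grid."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)

# Hypothetical normalized spectral functions on a 4-point frequency grid.
true_spec = [0.1, 0.4, 0.3, 0.2]
recon_spec = [0.12, 0.38, 0.31, 0.19]
kl = kl_divergence(true_spec, recon_spec)
err = mse(true_spec, recon_spec)
```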

Lingxiao Wang · Kai Zhou
-
(Poster) [ Visit Poster at Spot C3 in Virtual World ]

Machine learning methods have enabled new ways of performing inference on high-dimensional datasets modeled using complex simulations. We leverage recent advancements in simulation-based inference in order to characterize the contribution of various modeled components to γ-ray data of the Galactic Center recorded by the Fermi satellite. A specific goal here is to differentiate "smooth" emission, as expected for a dark matter origin, from more "clumpy" emission expected for a population of relatively bright, unresolved astrophysical point sources. Compared to traditional techniques based on the statistical distribution of photon counts, our method based on density estimation using normalizing flows is able to utilize more of the information contained in a given model of the Galactic Center emission, and in particular can perform posterior parameter estimation while accounting for pixel-to-pixel spatial correlations in the γ-ray map.

Siddharth Mishra-Sharma · Kyle Cranmer
-
(Poster) [ Visit Poster at Spot C2 in Virtual World ]   

Stokes inversion techniques are very powerful methods for obtaining information on the thermodynamic and magnetic properties of solar and stellar atmospheres. Most existing inversion codes are designed to find the optimum solution of the nonlinear inverse problem. However, to locate potentially multimodal solutions and degeneracies, and to obtain uncertainties for each parameter, algorithms such as Markov chain Monte Carlo require thousands of model evaluations. Variational methods offer a fast alternative by approximating the posterior with a parametrized distribution. In this study, we explore a highly flexible variational method, known as normalizing flows, to return Bayesian posterior probabilities for solar observations. We illustrate the ability of the method using a simple Milne-Eddington model and a complex non-LTE inversion. The training procedure need only be performed once for a given prior parameter space, and the resulting network can then generate samples describing the posterior distribution several orders of magnitude faster than existing techniques.
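
The essential mechanics of a normalizing flow, fast sampling plus exact log-density via the change-of-variables formula, can be shown with the simplest possible flow. The one-dimensional affine flow below is a sketch only (real inversion codes stack many conditional layers), and the parameter values are illustrative.

```python
import math

class AffineFlow:
    """Minimal 1D affine flow: x = mu + sigma * z with z ~ N(0, 1)."""
    def __init__(self, mu, log_sigma):
        self.mu, self.log_sigma = mu, log_sigma

    def sample(self, z):
        """Push a base-distribution sample through the flow."""
        return self.mu + math.exp(self.log_sigma) * z

    def log_prob(self, x):
        """Exact log-density via the change-of-variables formula."""
        z = (x - self.mu) / math.exp(self.log_sigma)
        base = -0.5 * (z * z + math.log(2.0 * math.pi))
        return base - self.log_sigma  # Jacobian term of the affine map

# Hypothetical posterior over one Milne-Eddington parameter.
flow = AffineFlow(mu=1.2, log_sigma=math.log(0.05))
samples = [flow.sample(z) for z in (-1.0, 0.0, 1.0)]
```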

Carlos Díaz Baso
-
(Poster) [ Visit Poster at Spot C1 in Virtual World ]   

Simulations provide the crucial link between theoretical descriptions and experimental observations in the physical sciences. In experimental particle physics, a complex ecosystem of tools exists to describe fundamental processes or the interactions of particles with detectors. The high computational cost associated with producing precise simulations in sufficient quantities --- e.g. for the upcoming data-taking phase of the Large Hadron Collider (LHC) or future colliders --- motivates research into more computationally efficient solutions. Using generative machine learning models to amplify the statistics of a given dataset is an especially promising direction. However, the simulation of realistic showers in a highly granular detector remains a daunting problem due to the large number of cells, values spanning many orders of magnitude, and the overall sparsity of data. This contribution advances the state of the art in two key directions: Firstly, we present a precise generative model for the fast simulation of hadronic showers in a highly granular hadronic calorimeter. Secondly, we compare the achieved simulation quality before and after interfacing with a so-called particle-flow-based reconstruction algorithm. Together, these bring generative models one step closer to practical applications.

Sascha Diefenbacher · Erik Buhmann · Engin Eren · Frank Gaede · Daniel C. Hundhausen · Gregor Kasieczka · William Korcari · Katja Krueger · Peter McKeown · Lennart Rustige
-
(Poster) [ Visit Poster at Spot C0 in Virtual World ]

Accurate measures of lightning activity can be used to predict extreme weather events in advance, saving lives and property. However, the current hand-crafted filtering algorithm for identifying true lightning events in data captured by the GLM onboard NOAA's GOES-R satellites is only 70% accurate, with a 5% false alarm rate. Given the large volume and high temporal resolution of the data, this work applies unsupervised learning techniques to detect lightning within raw data signals. We present a novel data processing pipeline for the GLM Level 0 products and a case-study comparison of two approaches to dimensionality reduction and clustering that sort the data by similar patterns. These clusters could then be labeled by a domain expert to accurately distinguish between noise and true lightning events. We demonstrate that autoencoders with graph convolution layers are able to learn a translationally invariant representation of the dataset, which allows k-means clustering to group samples with similar spatiotemporal patterns. This work is a first step towards building a machine learning pipeline that improves false-event filtering to identify lightning and enhance predictive abilities in the face of increasingly frequent extreme weather events.

Emma Benjaminson · Juan Emmanuel Johnson · Milad Memarzadeh · Nadia Ahmed
-
(Poster) [ Visit Poster at Spot B3 in Virtual World ]

Machine learning tools provide a significant improvement in sensitivity over traditional analyses by exploiting subtle patterns in high-dimensional feature spaces. These subtle patterns may not be well-modeled by the simulations used for training machine learning methods, resulting in an enhanced sensitivity to systematic uncertainties. Contrary to the traditional wisdom of constructing an analysis strategy that is invariant to systematic uncertainties, we study the use of a classifier that is fully aware of uncertainties and their corresponding nuisance parameters. We show on two datasets that this dependence can actually enhance the sensitivity to parameters of interest compared to baseline approaches. Finally, we provide a cautionary example for situations where uncertainty mitigating techniques may serve only to hide the true uncertainties.

Aishik Ghosh · Benjamin Nachman
-
(Poster) [ Visit Poster at Spot B2 in Virtual World ]

Studies of kilonovae, optical counterparts of binary neutron star mergers, rely on accurate simulation models. The most accurate simulations are computationally expensive; surrogate modelling provides a route to emulate the original simulations and therefore use them for statistical inference. We present a new implementation of surrogate construction using conditional variational autoencoders (cVAE) and discuss the challenges of this method. We additionally present model evaluation methods tailored to the scientific analyses of this field. We find that the cVAE surrogate produces errors well within a standard assumed systematic modelling uncertainty. We also report the results of our parameter inference study, finding our constrained parameters to be comparable with previously published results.

Kamile Lukosiute · Brian Nord
-
(Poster) [ Visit Poster at Spot B1 in Virtual World ]

In this work, we examine the robustness of state-of-the-art semi-supervised learning (SSL) algorithms when applied to morphological classification in modern radio astronomy. We test whether SSL can achieve performance comparable to the current supervised state of the art when using many fewer labelled data points and if these results generalise to using truly unlabelled data. We find that although SSL provides additional regularisation, its performance degrades rapidly when using very few labels, and that using truly unlabelled data leads to a significant drop in performance.

Inigo V Slijepcevic · Anna Scaife
-
(Poster) [ Visit Poster at Spot B0 in Virtual World ]   

Physical systems obey strict symmetry principles. We expect that machine learning methods that intrinsically respect these symmetries should perform better than those that do not. In this work we implement a principled model based on invariant scalars, and release open-source code. We apply this Scalars method to a simple chaotic dynamical system, the springy double pendulum. We show that the Scalars method outperforms state-of-the-art approaches for learning the properties of physical systems with symmetries, both in terms of accuracy and speed. Because the method incorporates the fundamental symmetries, we expect it to generalize to different settings, such as changes in the force laws in the system.
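
The core of a scalars-based model is that inner products of the system's vectors are unchanged by rotations, so any function of them is automatically invariant. The sketch below demonstrates this on hypothetical 3D state vectors (the values and the choice of a z-axis rotation are illustrative).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def invariant_scalars(vectors):
    """All pairwise inner products: a rotation-invariant featurization
    of a collection of vectors (positions, momenta, ...)."""
    return [dot(vectors[i], vectors[j])
            for i in range(len(vectors)) for j in range(i, len(vectors))]

def rotate_z(v, theta):
    """Rotate a 3D vector about the z-axis by angle theta."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

# Two toy state vectors of a pendulum-like system (illustrative values).
vecs = [(1.0, 0.0, 0.5), (0.2, -0.3, 0.1)]
feats = invariant_scalars(vecs)
rotated = [rotate_z(v, 0.7) for v in vecs]
feats_rot = invariant_scalars(rotated)
```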

Weichi Yao · Kate Storey-Fisher · David W Hogg · Soledad Villar
-
(Poster) [ Visit Poster at Spot A3 in Virtual World ]   

Identifying the dynamics of physical systems requires a machine learning model that can assimilate observational data but also incorporate the laws of physics. Neural networks based on physical principles, such as Hamiltonian or Lagrangian NNs, have recently shown promising results in generating extrapolative predictions and accurately representing a system's dynamics. We show that by additionally considering the actual energy level as a regularization term during training, and thus using physical information as an inductive bias, the results can be further improved. Especially when only small amounts of data are available, these improvements can significantly enhance predictive capability. We apply the proposed regularization term to a Hamiltonian Neural Network (HNN) and a Constrained Hamiltonian Neural Network (CHNN) for a single and a double pendulum, generate predictions under unseen initial conditions, and report significant gains in predictive accuracy.

Sebastian Kaltenbach · psk S Koutsourelakis
-
(Poster) [ Visit Poster at Spot A2 in Virtual World ]

Physically-inspired latent force models offer an interpretable alternative to purely data-driven tools for inference in dynamical systems. They carry the structure of differential equations and the flexibility of Gaussian processes, yielding interpretable parameters and dynamics-imposed latent functions. However, the existing inference techniques rely on the exact computation of posterior kernels, which are seldom available in analytical form. Applications relevant to practitioners, such as diffusion equations, are hence intractable. We overcome these computational problems by proposing a variational solution to a general class of non-linear and parabolic partial differential equation latent force models. We demonstrate the efficacy and flexibility of our framework by achieving competitive performance on several tasks.

Jacob Moss · Felix Opolka · Pietro Lió
-
(Poster) [ Visit Poster at Spot A1 in Virtual World ]

We propose a paradigm shift in the data-driven modeling of the instrumental response field of telescopes. By adding a differentiable optical forward model into the modeling framework, we change the data-driven modeling space from the pixels to the wavefront. This allows us to transfer a great deal of complexity from the instrumental response into the forward model, while remaining data-driven and able to adapt to the observations. Our framework allows us to build powerful models that are physically motivated and interpretable, and that do not require special calibration data. We show that for a realistic setting of the Euclid space telescope, this framework represents a real performance breakthrough, with reconstruction errors decreasing by a factor of 5 at observation resolution and by more than a factor of 10 for a 3x super-resolution. We successfully model chromatic variations of the instrument's response using only noisy broad-band in-focus observations.

Tobias Liaudat · Jean-Luc Starck
-
(Poster) [ Visit Poster at Spot A0 in Virtual World ]

Data-driven synthesis planning with machine learning is a key step in the design and discovery of novel inorganic compounds with desirable properties. Inorganic materials synthesis is often guided by chemists' prior knowledge and experience, built upon experimental trial-and-error that is both time and resource consuming. Recent developments in natural language processing (NLP) have enabled large-scale text mining of the scientific literature, providing open-source databases of synthesis information for synthesized compounds, material precursors, and reaction conditions (temperatures, times). In this work, we employ a conditional variational autoencoder (CVAE) to predict suitable reaction conditions for the crucial inorganic synthesis steps of calcination and sintering. We find that the CVAE model is capable of learning subtle differences in target material composition, precursor compound identities, and choice of synthesis route (solid-state, sol-gel) that are present in the inorganic synthesis space. Moreover, the CVAE can generalize well to unseen chemical entities and shows promise for predicting reaction conditions for previously unsynthesized compounds of interest.

Christopher Karpovich
-
(Poster) [ Visit Poster at Spot J3 in Virtual World ]   

In this work we introduce a novel approach to the pulsar classification problem in time-domain radio astronomy using a Born machine, often referred to as a quantum neural network. Using a single-qubit architecture, we show that the pulsar classification problem maps well to the Bloch sphere and that comparable accuracies to more classical machine learning approaches are achievable. We introduce a novel single-qubit encoding for the pulsar data used in this work and show that this performs comparably to a multi-qubit QAOA encoding.

Mo Kordzanganeh · Anna Scaife
-
(Poster) [ Visit Poster at Spot J2 in Virtual World ]   

Bayesian techniques have been shown to be extremely efficient in optimizing expensive-to-evaluate black-box functions, in both computational (offline design) and physical (online experimental control) contexts. Optimizing physical systems often comes with extra challenges due to the costs associated with changing parameters in real-life experimentation, such as measurement location in physical space or mechanical/electrical actuation. In these cases, the cost of changing a given input parameter is often proportional to the magnitude of the change, for example the time cost associated with the distance a physical actuator must travel. To minimize these costs, optimization algorithms can simply limit the maximum distance travelled in input space during each step. However, hard restrictions on the travel distance inhibit the global exploration advantages normally afforded by Bayesian optimization algorithms. In this work, we describe a proximal weighting term that can bias acquisition functions towards localized exploration, while still allowing large travel distances if far-away points are predicted to be valuable for observation. We describe a use case where this weighting is used to minimize the uncertainty of a particle accelerator Bayesian model in a smooth manner, which in turn minimizes the temporal costs associated with changing input parameters.
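
One natural form for such a proximal term is a Gaussian weight on the distance from the current settings, multiplying the raw acquisition value. The sketch below uses that form with made-up settings and an assumed squared-exponential shape; the paper's exact weighting may differ.

```python
import math

def proximal_weight(x, x_current, length_scale):
    """Multiplicative weight biasing acquisition toward points near the
    current machine settings, without forbidding distant moves."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, x_current))
    return math.exp(-d2 / (2.0 * length_scale ** 2))

def weighted_acquisition(acq_value, x, x_current, length_scale=1.0):
    return acq_value * proximal_weight(x, x_current, length_scale)

# Two candidate settings with equal raw acquisition value: the nearby one
# wins unless the distant one is predicted to be much more valuable.
x_now = (0.0, 0.0)
near = weighted_acquisition(1.0, (0.5, 0.0), x_now)
far = weighted_acquisition(1.0, (3.0, 0.0), x_now)
```

Unlike a hard travel limit, a distant point can still be selected when its raw acquisition value is large enough to outweigh the proximal penalty.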

Ryan Roussel · Auralee Edelen
-
(Poster) [ Visit Poster at Spot J1 in Virtual World ]

This study uses a sigma-variational autoencoder to learn a latent space of the Sun from the 12 channels taken by the Atmospheric Imaging Assembly (AIA) and the Helioseismic and Magnetic Imager (HMI) instruments on board the NASA Solar Dynamics Observatory. The model is able to significantly compress the large image dataset to 0.19% of its original size while still proficiently reconstructing the original images. As a downstream task making use of the learned representation, this study demonstrates the use of the solar latent space as an input to improve forecasts of the F30 solar radio flux index, compared to an off-the-shelf pretrained ResNet feature extractor. Finally, the developed models can be used to generate realistic synthetic solar images by sampling from the learned latent space.

Edward Brown · Christopher Bridges · Bernard Benson · Atilim Gunes Baydin
-
(Poster) [ Visit Poster at Spot J0 in Virtual World ]

Developing fast and accurate surrogates for physics-based coastal and ocean models is an urgent need, given coastal flood risk under accelerating sea level rise and the computational expense of deterministic numerical models. For this purpose, we develop the first digital twin of Earth's coastlines with new physics-informed machine learning techniques extending the state-of-the-art Fourier Neural Operator. As a proof-of-concept study, we built Fourier Neural Operator (FNO) surrogates on simulations of an industry-standard flood and ocean model (NEMO). The resulting FNO surrogate accurately predicts the sea surface height in most regions while achieving upwards of a 45x speedup over NEMO. We deliver an open-source CoastalTwin platform in an end-to-end and modular way, enabling easy extensions to other simulations and ML-based surrogate methods. Our results and deliverables provide a promising approach to massively accelerate coastal dynamics simulators, enabling scientists to efficiently execute many simulations for decision-making, uncertainty quantification, and other research activities.
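
The defining operation of an FNO layer is a convolution applied in Fourier space: transform, scale a small number of low-frequency modes by learnable weights, discard the rest, and transform back. The sketch below hand-rolls a 1D version with a naive DFT on a made-up signal; real FNOs use batched FFTs, complex weights per mode, and a pointwise nonlinearity.

```python
import cmath

def dft(xs):
    """Naive discrete Fourier transform (O(n^2), fine for a sketch)."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(xs)) for k in range(n)]

def idft(cs):
    """Inverse DFT, returning the real part for real-valued signals."""
    n = len(cs)
    return [sum(c * cmath.exp(2j * cmath.pi * k * t / n)
                for k, c in enumerate(cs)).real / n for t in range(n)]

def fourier_layer(xs, weights, n_modes):
    """Core FNO operation: scale the first `n_modes` frequency modes by
    per-mode weights and zero out the remaining modes."""
    spec = dft(xs)
    out = [weights[k] * spec[k] if k < n_modes else 0.0
           for k in range(len(spec))]
    return idft(out)

# A toy 1D "sea-surface height" signal (illustrative values).
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
smoothed = fourier_layer(signal, weights=[1.0] * 8, n_modes=3)
```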

Peishi Jiang · Constantin Weisser · Björn Lütjens · Dava Newman
-
(Poster) [ Visit Poster at Spot I3 in Virtual World ]   

Motivated by oceanographic observational datasets, we propose a probabilistic neural network (PNN) model for calculating turbulent energy dissipation rates from vertical columns of velocity and density gradients in density stratified turbulent flows. We train and test the model on high-resolution simulations of decaying turbulence designed to emulate geophysical conditions similar to those found in the ocean. The PNN model outperforms a baseline theoretical model widely used to compute dissipation rates from oceanographic observations of vertical shear, being more robust in capturing the tails of the output distributions at multiple different time points during turbulent decay. A differential sensitivity analysis indicates that this improvement may be attributed to the ability of the network to capture additional underlying physics introduced by density gradients in the flow.

Sam Lewin
-
(Poster) [ Visit Poster at Spot I2 in Virtual World ]
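
A common way to build such a probabilistic network is to have it output a mean and a log-variance per input and train with the Gaussian negative log-likelihood; whether this matches the paper's exact parameterization is our assumption. A NumPy sketch of that loss:

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of y under N(mu, exp(log_var)), constants
    dropped. Training a network to emit (mu, log_var) with this loss yields
    predictive uncertainties rather than bare point estimates."""
    return 0.5 * np.mean(log_var + (y - mu) ** 2 / np.exp(log_var))

y = np.array([1.0, 2.0, 3.0])           # "true" dissipation rates (toy values)
mu = y - 1.0                            # predictions that are each off by 1
nll_confident = gaussian_nll(y, mu, np.full(3, -2.0))  # small claimed variance
nll_cautious = gaussian_nll(y, mu, np.full(3, 0.0))    # unit claimed variance
```

A confidently wrong prediction is penalized more than a cautious one with the same error, which is what pushes a model trained this way to represent the tails of the output distribution honestly.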

The Lipschitz constant of the map between the input and output space represented by a neural network is a natural metric for assessing the robustness of the model. We present a new method to constrain the Lipschitz constant of dense deep learning models that can also be generalized to other architectures. The method relies on a simple weight normalization scheme during training which ensures every layer is 1-Lipschitz. A simple residual connection can then be used to make the model monotonic in any subset of its inputs, which is useful in scenarios where domain knowledge dictates such dependence. Examples can be found in algorithmic fairness requirements or, as presented here, in the classification of particle decays. Our normalization is minimally constraining and allows the underlying architecture to maintain higher expressiveness compared to other techniques which aim to either control the Lipschitz constant of the model or ensure its monotonicity. We show how the algorithm was used to train a powerful, robust, and interpretable discriminator for heavy-flavor decays in the LHCb Run 3 trigger system.

Niklas S Nolte · Ouail Kitouni · Mike Williams
-
(Poster) [ Visit Poster at Spot I1 in Virtual World ]
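
The two ingredients described above, per-layer weight normalization and a residual connection for monotonicity, can be sketched in a few lines of NumPy. This is an illustrative reconstruction under our own simplifications (spectral-norm normalization, a single monotone input), not the exact scheme from the paper.

```python
import numpy as np

def lipschitz_normalize(W):
    """Rescale W so its largest singular value is at most 1, making the
    linear map 1-Lipschitz; applied to every layer (ReLU is itself
    1-Lipschitz), this bounds the Lipschitz constant of the whole network."""
    sigma = np.linalg.norm(W, ord=2)
    return W if sigma <= 1.0 else W / sigma

def monotonic_net(x, layers, mono_idx=(0,)):
    """1-Lipschitz MLP plus an identity residual on selected inputs.
    The MLP can change by at most 1 per unit change of input, so adding
    x[i] makes the output non-decreasing in those coordinates."""
    h = x
    for W in layers[:-1]:
        h = np.maximum(lipschitz_normalize(W) @ h, 0.0)
    out = lipschitz_normalize(layers[-1]) @ h
    return out + sum(x[i] for i in mono_idx)

rng = np.random.default_rng(0)
layers = [rng.normal(size=(8, 3)), rng.normal(size=(1, 8))]
x = rng.normal(size=3)
f_lo = monotonic_net(x, layers)[0]
f_hi = monotonic_net(x + np.array([0.5, 0.0, 0.0]), layers)[0]  # increase input 0
```

The appeal of this construction is that monotonicity is guaranteed by the architecture itself, so the guarantee survives training and needs no post-hoc verification.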

Discrete Fracture Network (DFN) flow simulations are commonly used to determine the outflow in fractured media for critical applications. Here, we extend the formulation of spatial graph neural networks with a new architecture, called Graph Informed Neural Network (GINN), to speed up the Uncertainty Quantification analyses for DFNs. We show that the GINN model allows better Monte Carlo estimates of the mean and standard deviation of the outflow of a test case DFN.

Stefano Berrone · Francesco Della Santa · Antonio Mastropietro · Sandra Pieraccini · Francesco Vaccarino
-
(Poster) [ Visit Poster at Spot I0 in Virtual World ]

Analyzing the high-dimensional data collected at the Large Hadron Collider experiments often requires a balance between maximizing sensitivity and maintaining interpretability by domain experts. We propose a new algorithm to construct powerful summary statistics for LHC processes in the form of simple symbolic expressions. First, we extract latent information from a chain of simulators; through symbolic regression on this data we then learn approximately sufficient statistics. Observables constructed in this way can be used as plug-in replacements for established summary statistics, potentially improving the precision of scientific results without adding any overhead. In Higgs production in weak boson fusion, our algorithm rediscovers well-known heuristics and proposes new, moderately complex formulas that rival the new physics reach of neural networks.

Nathalie Soybelman · Anja Butter · Tilman Plehn · Johann Brehmer
-
(Poster) [ Visit Poster at Spot H3 in Virtual World ]   

We present results exploring the role that probabilistic deep learning models can play in cosmology from large-scale astronomical surveys by estimating the distances to galaxies (redshifts) from photometry. Due to the massive scale of data coming from these new and upcoming sky surveys, machine learning techniques using galaxy photometry are increasingly adopted to predict galactic redshifts, which are important for inferring cosmological parameters such as the nature of Dark Energy. Associated uncertainty estimates are also critical measurements; however, common machine learning methods typically provide only point estimates and lack uncertainty information as outputs. We turn to Bayesian Neural Networks (BNNs) as a promising way to provide accurate predictions of redshift values with uncertainties. We have compiled a new galaxy training dataset from the Hyper Suprime-Cam Survey, designed to mimic large surveys but over a smaller portion of the sky. We evaluate the performance and accuracy of photometric redshift (photo-z) predictions using machine learning, astronomical, and probabilistic metrics. We find that while the Bayesian Neural Network did not perform as well as non-Bayesian neural networks when evaluated solely by point-estimate photo-z values, BNNs can provide the uncertainty estimates that are necessary for cosmology.

Evan Jones
-
(Poster) [ Visit Poster at Spot H2 in Virtual World ]   
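
One practical way to obtain such uncertainty estimates from a network is Monte-Carlo dropout: leave dropout active at prediction time and read the spread of repeated stochastic forward passes as the uncertainty. Whether this is the paper's exact BNN scheme is our assumption; the toy architecture and sizes below are ours.

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_dropout_predict(x, W1, W2, n_samples=200, p_drop=0.5):
    """Monte-Carlo dropout: run many stochastic forward passes through a
    one-hidden-layer network with dropout left on, then report the mean
    (point estimate) and standard deviation (approximate uncertainty)."""
    preds = []
    for _ in range(n_samples):
        mask = rng.random(W1.shape[0]) > p_drop          # random unit dropout
        h = np.maximum(W1 @ x, 0.0) * mask / (1 - p_drop)
        preds.append(W2 @ h)
    preds = np.asarray(preds)
    return preds.mean(axis=0), preds.std(axis=0)

W1 = rng.normal(size=(16, 4))    # toy weights: 4 photometric bands -> 16 hidden units
W2 = rng.normal(size=(1, 16))    # hidden units -> one photo-z output
mean, std = mc_dropout_predict(rng.normal(size=4), W1, W2)
```

In a trained model the returned `std` would be propagated into downstream cosmological inference rather than discarded.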

A long-standing problem in the design of machine-learning tools for particle physics applications has been how to incorporate prior knowledge of physical symmetries. In this note we propose contrastive self-supervision as a solution to this problem, with jet physics as an example. Using a permutation-invariant transformer network, we learn a representation which outperforms hand-crafted competitors on a linear classification benchmark.

Barry M Dillon · Tilman Plehn · Gregor Kasieczka
-
(Poster) [ Visit Poster at Spot H1 in Virtual World ]   
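
The contrastive objective behind such representations is typically an InfoNCE-style loss: two augmented views of the same jet (for example, related by a symmetry transformation) form a positive pair and are pulled together, while all other jets in the batch are pushed apart. A NumPy sketch, with batch size and temperature chosen arbitrarily:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Contrastive loss on two batches of paired embeddings: z1[i] and
    z2[i] come from two views of the same jet. Rows are L2-normalised,
    similarities become softmax logits, and positives sit on the diagonal."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                   # cross-entropy on positives

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                              # 8 jets, 16-dim embeddings
loss_aligned = info_nce(z, z)                             # views perfectly aligned
loss_random = info_nce(z, rng.normal(size=(8, 16)))       # unrelated "views"
```

Training on symmetry-transformed views is what bakes the physical symmetry into the learned representation without hand-crafting invariant features.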

Experimental advances enabling high-resolution external control create new opportunities to produce materials with exotic properties. In this work, we investigate how a multi-agent reinforcement learning approach can be used to design external control protocols for self-assembly. We find that a fully decentralized approach performs remarkably well even with a "coarse" level of external control. More importantly, we see that a partially decentralized approach, in which agents receive information about surrounding regions, allows us to better control the system towards a target distribution. We explain this by analyzing our approach as a partially observed Markov decision process. With a partially decentralized approach, the agent is able to act more presciently, both by preventing the formation of undesirable structures and by better stabilizing target structures, as compared to a fully decentralized approach.

Shriram Chennakesavalu · Grant Rotskoff
-
(Poster) [ Visit Poster at Spot H0 in Virtual World ]

Recurrent neural networks (RNNs) are a class of neural networks that have emerged from the paradigm of artificial intelligence and have enabled many interesting advances in the field of natural language processing. Interestingly, these architectures were shown to be powerful ansatze for approximating the ground state of quantum systems [1]. Here, we build on the results of Ref. [1] and construct a more powerful RNN wave function ansatz in two dimensions. We use symmetry and annealing to obtain accurate estimates of the ground state energies of the two-dimensional (2D) Heisenberg model, on the square lattice and on the triangular lattice. We show that our method is superior to Density Matrix Renormalisation Group (DMRG) for system sizes larger than or equal to 12x12 on the triangular lattice.

[1] M. Hibat-Allah, M. Ganahl, L. E. Hayward, R. G. Melko, and J. Carrasquilla, "Recurrent neural network wave functions," Physical Review Research, Jun 2020.

Mohamed Hibat Allah · Juan Carrasquilla · Roger Melko
-
(Poster) [ Visit Poster at Spot G3 in Virtual World ]

We show how to deal with uncertainties on the Reference Model predictions in a signal-model-independent new physics search strategy based on artificial neural networks. Our approach builds directly on the Maximum Likelihood ratio treatment of uncertainties as nuisance parameters for hypothesis testing that is routinely employed in high-energy physics. After presenting the conceptual foundations of the method, we show its applicability in a multivariate setup by studying the impact of two typical sources of experimental uncertainties in two-body final states at the LHC.

Gaia Grosso · Maurizio Pierini
-
(Poster) [ Visit Poster at Spot G2 in Virtual World ]   

Physical simulation-based optimization is a common task in science and engineering. Many such simulations produce image- or tensor-based outputs where the desired objective is a function of that image with respect to a high-dimensional parameter space. We develop a Bayesian optimization method leveraging tensor-based Gaussian process surrogates and trust-region Bayesian optimization to effectively model the image outputs and to efficiently optimize these types of simulations, including an optical design problem and a radio-frequency tower configuration problem.

Wesley Maddox · Qing Feng · Max Balandat
-
(Poster) [ Visit Poster at Spot G1 in Virtual World ]

In the coming years, a new generation of sky surveys, in particular the Euclid Space Telescope (2022) and the Rubin Observatory’s Legacy Survey of Space and Time (LSST, 2023), will discover more than 200,000 new strong gravitational lenses, an increase of more than two orders of magnitude over currently known sample sizes. Accurate and fast analysis of such large volumes of data under a robust statistical framework is therefore crucial for all sciences enabled by strong lensing. Here, we report on the application of simulation-based inference methods, in particular density estimation techniques, to the prediction of the parameters of strong lensing systems with neural networks. This allows us to explicitly impose desired priors on lensing parameters, while guaranteeing convergence to the optimal posterior in the limit of perfect performance.

Ronan Legin
-
(Poster) [ Visit Poster at Spot G0 in Virtual World ]

Physical networks can learn desirable functions using local learning rules in space and time. Real learning systems, like natural neural networks, can learn out of equilibrium, on timescales comparable to their physical relaxation. Here we study coupled learning, a framework that supports learning in equilibrium in diverse physical systems. Relaxing the equilibrium assumption, we study experimentally and theoretically how physical resistor networks learn allosteric functions far from equilibrium. We show how fast learning produces oscillatory dynamics beyond a critical threshold, and that learning succeeds well beyond that threshold. These findings show how coupled learning rules may train systems much faster than assumed before, suggesting their applicability to slowly relaxing physical systems.

Nachi Stern · Andrea Liu
-
(Poster) [ Visit Poster at Spot F3 in Virtual World ]   

Triggering plays a vital role in high energy nuclear and particle physics experiments. Here we propose a new trigger system design for heavy charm quark events in proton+proton (p+p) collisions in the sPHENIX experiment at the Relativistic Heavy Ion Collider (RHIC). This trigger system selects a charm event created in a p+p collision by identifying the topology of a charm-hadron (D^0) decay into a pair of oppositely charged kaon and pion particles. Classical approaches are based on statistical models, rely on complex hand-designed features, and are both cost-prohibitive and inflexible for discovering charm events against a large background of other collision events. The proposed neural-network-based trigger system takes into account unique high-level features of charm events, using a stack of images that are embedded in a deep neural network. By incorporating two state-of-the-art graph neural networks, ParticleNet and SAGPool, we can learn high-level physics features and perform binary classification with simple geometrical track information. Our model attains nearly 75% accuracy and requires only moderate resources. With a small number of neurons and simple input, our model is designed to be compatible with FPGAs and thereby enables extremely fast decision modules for real-time p+p collision events in the upcoming sPHENIX experiment at RHIC.

Yimin Zhu · Tingting Xuan
-
(Poster) [ Visit Poster at Spot F2 in Virtual World ]   

Autoencoders have useful applications in high energy physics in both compression and anomaly detection, particularly for jets: collimated showers of particles produced in collisions such as those at the CERN Large Hadron Collider. We explore the use of graph-based autoencoders, which operate on jets in their "particle cloud" representations and can leverage the interdependencies among the particles within jets, for such tasks. Additionally, we develop a differentiable approximation to the energy mover's distance via a graph neural network, which may subsequently be used as a reconstruction loss function for autoencoders.

Steven Tsan · Sukanya Krishna · Raghav Kansal · Anthony Aportela · Farouk Mokhtar · Daniel Diaz · Javier Duarte · Maurizio Pierini · jean-roch vlimant
-
(Poster) [ Visit Poster at Spot F1 in Virtual World ]   

Benford's Law (BL), or the Significant Digit Law, defines the probability distribution of the first digit of numerical values in a data sample. This law is observed in many naturally occurring datasets; it can be seen as a measure of the naturalness of a given distribution and finds application in areas like anomaly and fraud detection. In this work, we address the following question: is the distribution of the neural network parameters related to the network's generalization capability? To that end, we first define a metric, MLH (Model Enthalpy), that measures the closeness of a set of numbers to BL. Second, we use MLH as an alternative to validation accuracy for early stopping, removing the need for a validation set. We provide experimental evidence that even when the optimal size of the validation set is known beforehand, the peak test accuracy attained is lower than when no validation set is used at all. Finally, we investigate the connection of BL to the Free-Energy Principle and the First Law of Thermodynamics, showing that MLH is a component of the internal energy of the learning system, with optimization analogous to minimizing the total energy to attain equilibrium.

Surya Kant Sahu · Abhinav Java · Arshad Shaikh
-
(Poster) [ Visit Poster at Spot F0 in Virtual World ]   
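
How close a set of numbers sits to Benford's Law can be quantified directly from its first-digit histogram. The sketch below uses the total-variation distance as the closeness measure; the paper's MLH metric plays an analogous role, but its exact definition may differ.

```python
import numpy as np

BENFORD = np.log10(1 + 1 / np.arange(1, 10))     # P(first digit = d), d = 1..9

def first_digits(x):
    """Leading significant digit of each nonzero value."""
    x = np.abs(x[x != 0])
    return (x / 10 ** np.floor(np.log10(x))).astype(int)

def benford_distance(x):
    """Closeness-to-Benford metric: total-variation distance between the
    empirical first-digit distribution of x and Benford's Law."""
    counts = np.bincount(first_digits(x), minlength=10)[1:10]
    return 0.5 * np.abs(counts / counts.sum() - BENFORD).sum()

rng = np.random.default_rng(0)
lognormal = rng.lognormal(mean=0.0, sigma=3.0, size=50_000)  # spans many decades
uniform = rng.uniform(1.0, 2.0, size=50_000)                  # first digit always 1
d_lognormal = benford_distance(lognormal)
d_uniform = benford_distance(uniform)
```

A broad, multi-decade distribution (the lognormal) lands much closer to Benford than one confined to a single leading digit (the uniform on [1, 2)).
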

The periodic pulsations of stars teach us about their underlying physical processes. We present a convolutional autoencoder-based pipeline as an automatic approach to search for out-of-distribution anomalous periodic variables within the Zwicky Transient Facility (ZTF) catalog of periodic variables. We use an isolation forest to rank each periodic variable by its anomaly score. Our overall most anomalous events have a unique physical origin: they are mostly red, cool, highly variable, and irregularly oscillating periodic variables. Observational data suggest that they are most likely young and massive ($\simeq5-10$M$_\odot$) Red Giant or Asymptotic Giant Branch stars. Furthermore, we use the learned latent features for the classification of periodic variables through a hierarchical random forest. This novel semi-supervised approach allows astronomers to identify the most anomalous events within a given physical class, significantly increasing the potential for scientific discovery.

Leon Chan · Siu Hei Cheung · Shirley Ho
-
(Poster) [ Visit Poster at Spot E3 in Virtual World ]   

Astronomical source deblending is the process of separating the contribution of individual stars or galaxies (sources) to an image comprised of multiple, possibly overlapping sources. Astronomical sources display a wide range of sizes and brightnesses and may show substantial overlap in images. Astronomical imaging data can further challenge off-the-shelf computer vision algorithms owing to its high dynamic range, low signal-to-noise ratio, and unconventional image format. These challenges make source deblending an open area of astronomical research, and in this work, we introduce a new approach called Partial-Attribution Instance Segmentation that enables source detection and deblending in a manner tractable for deep learning models. We provide a novel neural network implementation as a demonstration of the method.

Ryan Hausen
-
(Poster) [ Visit Poster at Spot E2 in Virtual World ]

We present ParSNIP, a novel method of producing generative models of sparse time series from astrophysical transients. Our approach combines a multilayer perceptron that describes the unknown physics of the sources with an explicit physics model that describes how light propagates through the universe and is observed by telescopes. We train this model using variational inference on real and simulated data. With a three-dimensional intrinsic representation, we can model time series with uncertainties of 4-6%. Our hybrid architecture disentangles intrinsic latent variables from ones that are explicitly modeled which enables many science applications. The ParSNIP model can perform redshift-independent photometric classification with a 2.4 ± 0.1 times lower false positive rate than state-of-the-art methods, identify new kinds of transients with 71 ± 28% accuracy on the simulated PLAsTiCC challenge dataset, and perform cosmological distance estimation with an accuracy of 6.9 ± 0.3% for Type Ia supernovae from Pan-STARRS1.

Kyle Boone
-
(Poster) [ Visit Poster at Spot E1 in Virtual World ]   

Both time-domain and gravitational-wave (GW) astronomy have gone through a revolution in the last decade. These two previously disjoint fields converged when the electromagnetic (EM) counterpart of a binary neutron star merger, GW170817, was discovered in 2017. However, despite the discovery rate of GWs steadily increasing, severalfold in each observing run of the LIGO/Virgo GW instruments, GW170817 remains the only success story of EM-GW astronomy. While future GW detectors will detect an even larger number of events, this does not guarantee a corresponding increase in the number of EM counterparts discovered. In fact, the growing number is overwhelming, since wide-field telescope surveys will have to contend with distinguishing the optical EM counterpart, called a kilonova, from the ever-increasing number of ``vanilla'' transient objects they encounter during a GW follow-up operation. To this end, we present a novel tool based on a temporal convolutional network (TCN) architecture for Electromagnetic Counterpart Identification (El-CID). The overarching goal of El-CID is to slice through the list of objects that are consistent with the GW sky localization and determine which sources are consistent with kilonovae, allowing limited and judicious use of telescope and spectroscopic resources. Our classifier is trained on sparse early-time photometry and contextual information available during discovery. Apart from verifying our model on an extensive testing sample, we also show successful results on real events from previous LIGO/Virgo observing runs.

Deep Chatterjee
-
(Poster) [ Visit Poster at Spot E0 in Virtual World ]

In many cyber-physical systems, imaging can be an important but expensive or 'difficult to deploy' sensing modality. One such example is detecting combustion instability using flame images, where deep learning frameworks have demonstrated state-of-the-art performance. The proposed frameworks are also shown to be quite trustworthy such that domain experts can have sufficient confidence to use these models in real systems to prevent unwanted incidents. However, flame imaging is not a common sensing modality in engine combustors today. Therefore, the current roadblock exists on the hardware side regarding the acquisition and processing of high-volume flame images. On the other hand, the acoustic pressure time series is a more feasible modality for data collection in real combustors. To utilize acoustic time series as a sensing modality, we propose a novel cross-modal encoder-decoder architecture that can reconstruct cross-modal visual features from acoustic pressure time series in combustion systems. With the "distillation" of cross-modal features, the results demonstrate that the detection accuracy can be enhanced using the virtual visual sensing modality. By providing the benefit of cross-modal reconstruction, our framework can prove to be useful in different domains well beyond the power generation and transportation industries.

Tryambak Gangopadhyay · Vikram Ramanan · Chakravarthy S.R. · Soumik Sarkar
-
(Poster) [ Visit Poster at Spot D3 in Virtual World ]   

Elliptic partial differential equations (PDEs) are common in many areas of physics, from the Poisson equation in plasmas and incompressible flows to the Helmholtz equation in electromagnetism. Their numerical solution requires solving linear systems, which can become a performance bottleneck. The rise of computational power and the inherent speed of GPUs offer exciting opportunities to solve PDEs by recasting them as optimization problems. In plasma fluid simulations, the Poisson equation is solved coupled to the charged-species transport equations. We introduce PlasmaNet (https://gitlab.com/cerfacs/plasmanet), an open-source library written to study neural networks in plasma simulations. Previous work using PlasmaNet has shown significant speedups from using neural networks to solve the Poisson equation compared to classical linear system solvers on this problem. Results also showed that coupling the neural network Poisson solver to the plasma transport equations is a viable option in terms of accuracy. In this work, we attempt to solve a new class of elliptic differential equations, the screened Poisson equations, using neural networks. These equations are used to infer the photoionization source term from the ionization rate in streamer discharges. We follow the same methodology as that adopted for the Poisson equation. A simulation running with three neural networks, one to solve the Poisson equation and two to solve the photoionization equations, yields accurate results, extending the range of applicability of the previously developed method.

Lionel Cheng · Michaël Bauerheim
-
(Poster) [ Visit Poster at Spot D2 in Virtual World ]

We design modular and rotationally equivariant DeepSets for predicting a continuous background quantity from a set of known foreground particles. Using this architecture, we address a crucial problem in cosmology: modelling the continuous electron pressure field inside massive structures known as “clusters.” Given a simulation of pressureless dark matter particles, our networks can directly and accurately predict the background electron pressure field. The modular design of our architecture makes it possible to physically interpret the individual components. Our most powerful deterministic model improves by 70% on the benchmark. A conditional-VAE extension yields a further 7% improvement, though it is limited by our small training set. We envision use cases beyond theoretical cosmology, for example in soft condensed matter physics, or in meteorology and climate science.

Leander Thiele · Miles Cranmer · Shirley Ho · David Spergel
-
(Poster) [ Visit Poster at Spot D1 in Virtual World ]   
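
The permutation-invariant backbone of a DeepSets model takes only a few lines: encode each particle independently, sum-pool, decode. The sketch below demonstrates just that invariance (the paper's rotational equivariance and modular structure are additional machinery); all sizes and names are illustrative.

```python
import numpy as np

def deep_set(particles, phi_W, rho_W):
    """Minimal DeepSets regressor: an encoder phi is applied to every
    particle independently, the results are sum-pooled (order-independent),
    and a decoder rho maps the pooled vector to the output, here standing
    in for the electron pressure at a point."""
    encoded = np.maximum(particles @ phi_W, 0.0)   # phi, shared across particles
    pooled = encoded.sum(axis=0)                   # permutation-invariant pooling
    return rho_W @ pooled                          # rho

rng = np.random.default_rng(0)
phi_W = rng.normal(size=(3, 32))                   # 3-d positions -> 32-d features
rho_W = rng.normal(size=32)
cloud = rng.normal(size=(100, 3))                  # 100 toy dark-matter particles
shuffled = cloud[rng.permutation(100)]
out_cloud = deep_set(cloud, phi_W, rho_W)
out_shuffled = deep_set(shuffled, phi_W, rho_W)
```

Because particles carry no meaningful ordering, building the invariance into the architecture (rather than learning it from data) is what keeps such models sample-efficient.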

The morphologies of galaxies and their relation with physical features have been extensively studied in the past. Galaxy morphology labels are usually created by humans and are used to train machine learning models. Human labels have been shown to contain biases in terms of observational parameters such as the resolution of the labeled images. In this work, we demonstrate that deep learning models trained on biased galaxy data produce biased predictions. We also propose a method to train neural networks that takes into account this inherent labeling bias. We show that our deep de-biasing method is able to reduce the bias of the models even when trained using biased data.

Esteban Medina · Guillermo Cabrera-Vives
-
(Poster) [ Visit Poster at Spot D0 in Virtual World ]

Simulations of high-energy particle collisions, such as those used at the Large Hadron Collider, are based on quantum field theory; however, many approximations are made in practice. For example, the simulation of the parton shower, which gives rise to objects called `jets', is based on a semi-classical approximation that neglects various interference effects. While there is a desire to incorporate interference effects, new computational techniques are needed to cope with the exponential growth in complexity associated with quantum processes. We present a classical algorithm called the quantum trellis to efficiently compute the unnormalized probability density over N-body phase space, including all interference effects, and we pair this with an MCMC-based sampling strategy. This provides a potential path forward for classical computers and a strong baseline for approaches based on quantum computing.

Sebastian Macaluso · Kyle Cranmer
-
(Poster) [ Visit Poster at Spot C3 in Virtual World ]   
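
Pairing an unnormalized density with MCMC works because Metropolis-style samplers only ever use density ratios, so the unknown normalization constant cancels. Below is a generic random-walk Metropolis sketch in NumPy, with a toy oscillatory density standing in for the interference-bearing one the trellis computes.

```python
import numpy as np

def metropolis(log_density, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis sampler: draws from a density known only up
    to normalisation, since the accept/reject rule uses log-density
    differences (i.e. ratios) in which the normalisation cancels."""
    rng = np.random.default_rng(seed)
    x, logp = x0, log_density(x0)
    samples = []
    for _ in range(n_steps):
        prop = x + step * rng.normal(size=x.shape)   # Gaussian proposal
        logp_prop = log_density(prop)
        if np.log(rng.random()) < logp_prop - logp:  # accept/reject
            x, logp = prop, logp_prop
        samples.append(x)
    return np.asarray(samples)

# Toy unnormalised log-density with an oscillatory (interference-like) factor.
log_dens = lambda x: -0.5 * np.sum(x ** 2) + np.log(1 + 0.5 * np.cos(3 * x[0]) ** 2)
chain = metropolis(log_dens, np.zeros(2), 5000)
```

In the paper's setting, `log_density` would be the quantum-trellis evaluation over N-body phase space; everything else carries over unchanged.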

For a physical system to behave as desired, either its underlying governing rules must be known a priori or the system itself must be accurately measured. The complexity of a full measurement of the system scales with its size. When exposed to real-world conditions, such as perturbations or time-varying settings, a system calibrated for a fixed working condition might require non-trivial re-calibration, a process that could be prohibitively expensive, inefficient, and impractical for real-world use cases. In this work, we propose a learning procedure to obtain a desired target output from a physical system. We use Variational Auto-Encoders (VAEs) to provide a generative model of the system function and use this model to obtain the input of the system that produces the target output. We showcase the applicability of our method on two datasets in optical physics and neuroscience.

Babak Rahmani · Demetri Psaltis
-
(Poster) [ Visit Poster at Spot C2 in Virtual World ]   

Galaxy morphology is connected to various fundamental properties of a galaxy and studying the morphology of large samples of galaxies is central to understanding the relationship between morphology and the physics of galaxy formation and evolution. For the first time, we are able to use machine learning to estimate Bayesian posteriors for galaxy morphological parameters. To achieve this, GAMPEN, our machine learning framework, uses a spatial transformer network (STN), a convolutional neural network, and the Monte-Carlo Dropout technique. This novel application of an STN in astronomy also enables GAMPEN to crop out most secondary galaxies in the frame and focus on the galaxy of interest. We also demonstrate that by first training on simulations and then performing transfer learning using real data, we are able to achieve excellent estimates for morphological parameters of galaxies in the Hyper Suprime-Cam Wide survey, while using only a small amount of real training data.

Aritra Ghosh
-
(Poster) [ Visit Poster at Spot C1 in Virtual World ]   

Understanding the details of small-scale convection and storm formation is crucial to accurately represent the larger-scale planetary dynamics. Presently, atmospheric scientists run high-resolution, storm-resolving simulations to capture these kilometer-scale weather details. However, because they contain abundant information, these simulations can be overwhelming to analyze using conventional approaches. This paper takes a data-driven approach and jointly embeds spatial arrays of vertical wind velocities, temperatures, and water vapor information as three "channels" of a VAE architecture. Our "multi-channel VAE" results in more interpretable and robust latent structures than earlier work analyzing vertical velocities in isolation. Analyzing and clustering the VAE's latent space identifies weather patterns and their geographical manifestations in a fully unsupervised fashion. Our approach shows that VAEs can play essential roles in analyzing high-dimensional simulation data and extracting critical weather and climate characteristics.

Harshini Mangipudi · Griffin Mooers · Mike Pritchard · Tom Beucler · Stephan Mandt
-
(Poster) [ Visit Poster at Spot C0 in Virtual World ]

Low surface brightness galaxies (LSBGs), galaxies that are fainter than the dark night sky, are famously difficult to detect. However, studies of these galaxies are essential to improve our understanding of the formation and evolution of low-mass galaxies. In this work, we train a deep learning model using the Mask R-CNN framework on a set of simulated LSBGs inserted into images from the Dark Energy Survey (DES) Data Release 2 (DR2). This deep learning model is combined with several conventional image pre-processing steps to develop a pipeline for the detection of LSBGs. We apply this pipeline to the full DES DR2 coadd image dataset, and preliminary results show the detection of 22 large, high-quality LSBG candidates that went undetected by conventional algorithms. Furthermore, we find that the performance of our algorithm is greatly improved by including examples of false positives as an additional class during training.

Caleb Levy
-
(Poster) [ Visit Poster at Spot B3 in Virtual World ]

The SEparator for CApture Reactions (SECAR) is a next-generation recoil separator system at the Facility for Rare Isotope Beams (FRIB) designed for the direct measurement of capture reactions on unstable nuclei in inverse kinematics. To maximize the performance of the device, careful beam alignment to the central ion optical axis needs to be achieved. This can be difficult to attain through manual tuning by human operators without potentially leaving the system in a sub-optimal and irreproducible state. In this work, we present the first development of online Bayesian optimization with a Gaussian process model to tune an ion beam through a nuclear astrophysics recoil separator. We show that the method achieves small incoming angular deviations (0-1 mrad) in an efficient and reproducible manner that is at least 3 times faster than standard hand-tuning. This method is now routinely used for all separator tuning.

Sara Miskovich
-
(Poster) [ Visit Poster at Spot B2 in Virtual World ]
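
The tuning loop pairs a Gaussian-process model of the objective with an acquisition rule that chooses the next setting to try. The sketch below is a generic 1-D version with an upper-confidence-bound acquisition and a made-up "beam transmission" objective; the actual SECAR objective, kernel, and acquisition function are not taken from the paper.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def bayes_opt_step(x_obs, y_obs, grid, kappa=2.0, noise=1e-6):
    """One Bayesian-optimisation step: fit a GP posterior to the observed
    (setting, objective) pairs and pick the next setting by an
    upper-confidence-bound rule, mu + kappa * sigma."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf(grid, x_obs)
    mu = k_star @ np.linalg.solve(K, y_obs)                       # posterior mean
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    return grid[int(np.argmax(mu + kappa * np.sqrt(np.maximum(var, 0.0))))]

# Hypothetical objective: beam transmission peaked at a magnet setting of 0.6.
objective = lambda x: np.exp(-((x - 0.6) ** 2) / 0.02)
grid = np.linspace(0.0, 1.0, 201)
x_obs = np.array([0.1, 0.5, 0.9])                                 # initial scans
y_obs = objective(x_obs)
for _ in range(10):                                               # tuning loop
    x_next = bayes_opt_step(x_obs, y_obs, grid)
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))
best = x_obs[int(np.argmax(y_obs))]
```

The sample efficiency of this loop (a handful of evaluations, each an actual beam measurement) is what makes GP-based tuning attractive for slow, expensive accelerator hardware.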

Reliable modeling of conditional densities is important for quantitative scientific fields such as particle physics. In domains outside physics, implicit quantile neural networks (IQN) have been shown to provide accurate models of conditional densities. We present a successful application of IQNs to jet simulation and correction using the tools and simulated data from the Compact Muon Solenoid (CMS) Open Data portal.

Michelle Kuchera · Raghuram Ramanujan
-
(Poster) [ Visit Poster at Spot B1 in Virtual World ]   
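
The engine of an implicit quantile network is the pinball (quantile) loss: for a quantile level tau drawn at random each training step and fed to the network as an extra input, the loss is minimized exactly at the tau-quantile of the target, so a single model learns the full conditional density. A NumPy check of that minimizer property on toy data:

```python
import numpy as np

def pinball_loss(y, y_pred, tau):
    """Quantile regression loss: under-predictions are weighted by tau and
    over-predictions by (1 - tau), so the minimiser is the tau-quantile of y."""
    diff = y - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

rng = np.random.default_rng(0)
y = rng.normal(size=100_000)                 # toy stand-in for jet response values
grid = np.linspace(-3.0, 3.0, 601)
losses = [pinball_loss(y, q, 0.9) for q in grid]
best = grid[int(np.argmin(losses))]          # constant predictor minimising the loss
target = float(np.quantile(y, 0.9))          # the empirical 0.9-quantile
```

Sampling tau uniformly during training amounts to minimizing this loss at every quantile level simultaneously, which is how one network represents an entire conditional density.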

Astronomical transients are stellar objects that become temporarily brighter on various timescales and have led to some of the most significant discoveries in cosmology and astronomy. Some of these transients are the explosive deaths of stars known as supernovae while others are rare, exotic, or entirely new kinds of exciting stellar explosions. New astronomical sky surveys are observing unprecedented numbers of multi-wavelength transients, making standard approaches of visually identifying new and interesting transients infeasible. To meet this demand, we present two novel methods of quickly and automatically detecting anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We show that the flexibility of neural networks, the attribute that makes them such a powerful regressor for many tasks, is what makes them less suitable for anomaly detection when compared with our parametric model.

Daniel Muthukrishna
-
(Poster) [ Visit Poster at Spot B0 in Virtual World ]

Drawing on insights from wave optics, we find a heterogeneity in both complex- and real-valued neural networks: the phase of the weight matrix plays a much more important role than its amplitude counterpart. In complex-valued neural networks, we show that among different types of pruning, the weight matrix with only phase information preserved achieves the best accuracy, and this holds robustly across various depths and widths. The conclusion generalizes to real-valued neural networks, where signs take the place of phases. These findings enrich the techniques of network pruning and binary computation.

Yuqi Nie
-
(Poster) [ Visit Poster at Spot A3 in Virtual World ]

Solar radio flux along with geomagnetic indices are important indicators of solar activity and its effects. Extreme solar events such as flares and geomagnetic storms can negatively affect the space environment, including satellites in low-Earth orbit. Therefore, forecasting these space weather indices is of great importance for space operations and science. In this study, we propose a model based on long short-term memory neural networks to learn the distribution of time series data and provide a simultaneous multivariate 27-day forecast of the space weather indices using time series as well as solar image data. We show a 30–40% improvement in the root mean-square error when including solar image data with time series data compared to using time series data alone. Simple baselines such as persistence and running-average forecasts are also compared with the trained deep neural network models. We also quantify the uncertainty in our predictions using a model ensemble.
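The persistence and running-average baselines mentioned above are one-liners; a sketch (the toy series and window choices are illustrative):

```python
import numpy as np

def persistence_forecast(series, horizon=27):
    """Repeat the last observed value for every day of the horizon."""
    return np.full(horizon, series[-1], dtype=float)

def running_average_forecast(series, horizon=27, window=27):
    """Repeat the mean of the trailing window for every day of the horizon."""
    return np.full(horizon, np.mean(series[-window:]), dtype=float)

f107 = np.array([70, 72, 71, 75, 80, 78, 77], dtype=float)  # toy radio-flux series
print(persistence_forecast(f107, 3))                # [77. 77. 77.]
print(running_average_forecast(f107, 3, window=4))  # [77.5 77.5 77.5]
```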

Bernard Benson · Christopher Bridges · Atilim Gunes Baydin
-
(Poster) [ Visit Poster at Spot A2 in Virtual World ]   

Symmetries are a fundamental property of functions associated with data. A key function for any dataset is its probability density, and the symmetries thereof are referred to as the symmetries of the dataset itself. We provide a rigorous statistical notion of symmetry for a dataset, which involves reference datasets that we call "inertial" in analogy to inertial frames in classical mechanics. Then, we construct a novel approach to automatically discover symmetries from a dataset using a deep learning method based on an adversarial neural network. We test our method on the LHC Olympics dataset. Symmetry discovery may lead to new insights and can reduce the effective dimensionality of a dataset to increase its effective statistics.

Krish Desai · Benjamin Nachman · Jesse Thaler
-
(Poster) [ Visit Poster at Spot A1 in Virtual World ]

Semiconductor device models are essential to understand charge transport in thin-film transistors (TFTs). Using these TFT models to draw inference involves estimating parameters that fit the experimental data, which can include extracted charge-carrier mobility or measured current. Estimating these parameters helps us draw inferences about device performance. Fitting a TFT model to given experimental data relies on manual fine-tuning of multiple parameters by human experts. Several of these parameters may have confounding effects on the experimental data, making their individual effects non-intuitive to extract during manual tuning. To avoid this convoluted process, we propose a new method for automating the model-parameter extraction process, resulting in an accurate model fit. In this work, model-choice-based approximate Bayesian computation (ABC) is used to generate the posterior distribution of the estimated parameters using observed mobility at various gate-voltage values. Furthermore, we show that the extracted parameters can be accurately predicted from the mobility curves using gradient-boosted trees. This work also provides a comparative analysis of the proposed framework with fine-tuned neural networks, wherein the proposed framework is shown to perform better.
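A rejection-sampling sketch of ABC for a hypothetical piecewise-linear mobility model; the model form, parameter ranges, and tolerance are illustrative stand-ins, not the TFT model of the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def mobility_model(k, vt, vg):
    """Hypothetical mobility vs. gate voltage: zero below threshold vt, slope k."""
    return k * np.clip(vg - vt, 0.0, None)

vg = np.linspace(0, 5, 20)
observed = mobility_model(2.0, 1.0, vg) + rng.normal(0, 0.05, vg.size)

# Rejection ABC: keep parameter draws whose simulated curve is close to the data.
accepted = []
for _ in range(50_000):
    k, vt = rng.uniform(0.5, 4.0), rng.uniform(0.0, 3.0)
    dist = np.sqrt(np.mean((mobility_model(k, vt, vg) - observed) ** 2))
    if dist < 0.5:
        accepted.append((k, vt))
    if len(accepted) >= 200:
        break

posterior = np.array(accepted)
print(posterior.mean(axis=0))  # posterior mean lands near the true (2.0, 1.0)
```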

Neel Chatterjee · Somya Sharma · Ansu Chatterjee
-
(Poster) [ Visit Poster at Spot A0 in Virtual World ]   

Progress in machine learning (ML) relies on an appropriate encoding of inductive biases. Useful biases often exploit symmetries in the prediction problem, such as convolutional networks relying on translation equivariance. Automatically discovering these useful symmetries holds the potential to greatly improve the performance of ML systems, but still remains a challenge. In this work, we focus on sequential prediction problems and take inspiration from Noether's theorem to reduce the problem of finding inductive biases to meta-learning useful conserved quantities. We propose Noether Networks: a new type of architecture where a meta-learned conservation loss is optimized inside the prediction function. We show, theoretically and experimentally, that Noether Networks improve prediction quality, providing a framework for discovering inductive biases in sequential problems.
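The conservation loss at the heart of the approach penalises change in a quantity g along the predicted rollout. In the paper g is meta-learned; the sketch below hand-picks pendulum energy purely for illustration:

```python
import numpy as np

def conservation_loss(states, g):
    """Sum of squared changes in the quantity g along a trajectory."""
    q = np.array([g(s) for s in states])
    return float(np.sum((q[1:] - q[:-1]) ** 2))

# Example g: energy of a frictionless pendulum, state s = (angle, angular velocity).
energy = lambda s: 0.5 * s[1] ** 2 + (1.0 - np.cos(s[0]))

conserving = [(0.0, 1.0), (0.4510, 0.8944)]  # two states of (almost) equal energy
violating = [(0.0, 1.0), (0.0, 2.0)]         # energy jumps between steps
print(conservation_loss(conserving, energy) < conservation_loss(violating, energy))  # True
```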

Ferran Alet · Dylan Doblar · Allan Zhou · Josh Tenenbaum · Kenji Kawaguchi · Chelsea Finn
-
(Poster) [ Visit Poster at Spot H3 in Virtual World ]   

The particle-flow (PF) algorithm is used in general-purpose particle detectors to reconstruct a comprehensive particle-level view of the collision by combining information from different subdetectors. A graph neural network model, known as the MLPF algorithm, has been developed to replace rule-based PF. However, understanding the model's decision making is not straightforward, especially given the complexity of the set-to-set prediction task, dynamic graph building, and message-passing steps. In this paper, we adapt the layerwise relevance propagation technique to the MLPF algorithm to gauge the relevant nodes and features for its predictions. Through this we gain insight into the model's decision making.
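Layerwise relevance propagation itself is model-agnostic; for fully-connected layers the epsilon rule redistributes relevance in proportion to each input's contribution, conserving the total. A minimal sketch on a tiny random network (not the MLPF model):

```python
import numpy as np

def lrp_epsilon(weights, activations, R, eps=1e-9):
    """Back-propagate relevance R through linear layers (epsilon rule).

    weights[k]:     (d_in, d_out) matrix of layer k
    activations[k]: input vector to layer k
    """
    for W, a in zip(reversed(weights), reversed(activations)):
        z = a @ W + eps          # contributions aggregated at each output
        s = R / z                # relevance per unit of contribution
        R = a * (s @ W.T)        # redistribute back to the inputs
    return R

rng = np.random.default_rng(3)
x = rng.uniform(0.1, 1.0, 5)
W1, W2 = rng.uniform(0, 1, (5, 4)), rng.uniform(0, 1, (4, 2))
a1 = np.maximum(x @ W1, 0)       # ReLU (all pre-activations are positive here)
out = a1 @ W2
R_in = lrp_epsilon([W1, W2], [x, a1], out)
print(np.isclose(R_in.sum(), out.sum()))  # total relevance is conserved: True
```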

Farouk Mokhtar · Raghav Kansal · Daniel Diaz · Javier Duarte · Maurizio Pierini · jean-roch vlimant
-
(Poster) [ Visit Poster at Spot H2 in Virtual World ]   

Physics-informed neural networks allow models to be trained by physical laws described by general nonlinear partial differential equations. However, traditional architectures struggle to solve more challenging time-dependent problems. In this work, we present a novel physics-informed framework for solving time-dependent partial differential equations. Our proposed model utilizes discrete cosine transforms to encode spatial frequencies and recurrent neural networks to process the time evolution, achieving state-of-the-art performance on the Taylor-Green vortex relative to other physics-informed baseline models.
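The spatial-frequency encoding rests on the orthonormal DCT-II; a self-contained sketch of the transform pair (the recurrent time-stepping is omitted):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the cosine basis functions."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    M = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    M[0] /= np.sqrt(2.0)        # the constant mode gets a smaller norm factor
    return M

D = dct_matrix(8)
x = np.sin(np.linspace(0.0, np.pi, 8))   # a smooth spatial field
coeffs = D @ x                            # spatial frequencies (the encoding)
x_back = D.T @ coeffs                     # D is orthogonal, so D.T inverts it
print(np.allclose(x_back, x))             # True
```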

Benjamin Wu · Oliver Hennigh · Jan Kautz · Sanjay Choudhry · Wonmin Byeon
-
(Poster) [ Visit Poster at Spot H1 in Virtual World ]

Anomalous light curves indicate rare and as-yet unexplained phenomena associated with astronomical sources. With existing large surveys like the Zwicky Transient Facility (ZTF), and upcoming ones such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) that will observe astrophysical transients at all time scales and produce archival light curves in the billions, there is an immediate need for methods that reveal anomalous light curves. Previous work explores anomalous light curve detection, but little work has gone into finding analogs of such light curves. That is, given a light curve of interest, can we find other examples in the dataset that behave similarly? We present such a pipeline that (1) identifies anomalous light curves, and (2) finds additional examples of specific rare classes, in a large corpus of light curves. We apply this method to Kepler data, finding around 5000 previously unknown anomalies, and present a subset of these anomalies along with their potential astrophysical classification.
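Step (2), finding analogs, reduces to a nearest-neighbour query in whatever feature space the light curves are embedded in; an illustrative Euclidean sketch:

```python
import numpy as np

def find_analogs(features, query_idx, k=3):
    """Return the k nearest light curves to the query, excluding itself."""
    dists = np.linalg.norm(features - features[query_idx], axis=1)
    return np.argsort(dists)[1:k + 1]

rng = np.random.default_rng(4)
features = np.vstack([rng.normal(0, 0.1, (5, 8)),    # one tight family of curves
                      rng.normal(5, 0.1, (5, 8))])   # a second, distant family
print(find_analogs(features, 0, k=3))  # all indices come from the first family
```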

Kushal Tirumala
-
(Poster) [ Visit Poster at Spot H0 in Virtual World ]   

Quantifying the morphology of galaxies has been an important task in astrophysics for understanding the formation and evolution of galaxies. In recent years, data volumes have been increasing dramatically due to several ongoing and upcoming surveys. Labeling and identifying interesting objects for further investigation has been explored by citizen science through the Galaxy Zoo Project and by machine learning, in particular with convolutional neural networks (CNNs). In this work, we explore the use of the Vision Transformer (ViT) for galaxy morphology classification for the first time. We show that ViT can reach results competitive with CNNs, and is particularly good at classifying smaller and fainter galaxies. With this promising preliminary result, we believe the ViT architecture can be an important tool for galaxy morphological classification in next-generation surveys. We plan to open-source our repository in the near future.

Joshua Yao-Yu Lin
-
(Poster) [ Visit Poster at Spot G3 in Virtual World ]   

This paper provides a comprehensive end-to-end pipeline to classify trigger versus background events, make online decisions to filter signal data, and enable an intelligent trigger system for efficient data collection in the sPHENIX Data Acquisition System (DAQ). The pipeline starts with the coordinates of pixel hits lit up by passing particles in the detector, applies three stages of event processing (hit clustering, track reconstruction, and trigger detection), and finally labels all processed events with a binary trigger-versus-background tag. The pipeline combines deterministic algorithms, such as clustering pixels to reduce event size and track reconstruction to predict candidate edges, with advanced graph neural network-based models for recognizing the entire jet pattern. In particular, we apply a message-passing graph neural network to predict links between hits and reconstruct tracks, and a hierarchical pooling algorithm (DiffPool) to make the graph-level trigger decision. We attain strong performance (above 70% accuracy) for trigger detection with only 3200 neuron weights in the end-to-end pipeline.

Tingting Xuan · Yimin Zhu
-
(Poster) [ Visit Poster at Spot G2 in Virtual World ]   

Particle accelerators require routine tuning during operation and when new isotope species are introduced. This is a complex process requiring many hours from experienced operators. The control aspect of this problem is challenging for traditional approaches, but is a promising candidate for reinforcement learning. We aim to develop an automated tuning procedure for the accelerators at TRIUMF, starting with the Off-Line Ion Source (OLIS) portion of the Isotope Separator and Accelerator (ISAC) facility. In this early stage of research, we show that the method of Recurrent Deep Deterministic Policy Gradients (RDPG) is successful in learning accelerator tuning procedures for a simple simulated environment representing the OLIS section.

David Wang
-
(Poster) [ Visit Poster at Spot G1 in Virtual World ]   

In this paper, we develop a Wasserstein autoencoder (WAE) with a hyperspherical prior for multimodal data in the application of inertial confinement fusion. Unlike a typical hyperspherical generative model that requires computationally inefficient sampling from distributions like the von Mises-Fisher, we sample from a normal distribution followed by a projection layer before the generator. Finally, to determine the validity of the generated samples, we exploit a known relationship between the modalities in the dataset as a scientific constraint, and study different properties of the proposed model.
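The sampling trick described above, a Gaussian draw followed by a projection layer in place of von Mises-Fisher sampling, is a few lines:

```python
import numpy as np

def sample_hypersphere(n, dim, rng):
    """Uniform samples on the unit hypersphere: Gaussian draw, then project."""
    z = rng.normal(size=(n, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

samples = sample_hypersphere(1000, 8, np.random.default_rng(5))
print(np.allclose(np.linalg.norm(samples, axis=1), 1.0))  # True
```

Normalising an isotropic Gaussian gives exactly uniform samples on the sphere, which is why the projection layer is a cheap substitute for explicit hyperspherical sampling.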

Ankita Shukla · Rushil Anirudh · Eugene Kur · Jayaraman Thiagarajan · Timo Bremer · Brian K Spears · Tammy Ma · Pavan Turaga
-
(Poster) [ Visit Poster at Spot G0 in Virtual World ]   

We establish a direct connection between general tensor networks and deep feed-forward artificial neural networks. The core of our results is the construction of neural-network layers that efficiently perform tensor contractions, and that use commonly adopted non-linear activation functions. The resulting deep networks feature a number of edges that closely matches the contraction complexity of the tensor networks to be approximated. In the context of many-body quantum states, this result establishes that neural-network states have strictly the same or higher expressive power than practically usable variational tensor networks. As an example, we show that all matrix product states can be efficiently written as neural-network states with a number of edges polynomial in the bond dimension and depth logarithmic in the system size. The opposite instead does not hold true, and our results imply that there exist quantum states that are not efficiently expressible in terms of matrix product states or practically usable PEPS, but that are instead efficiently expressible with neural network states.
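For concreteness, a matrix product state assigns an amplitude to each basis configuration through a chain of matrix products; a minimal contraction routine with illustrative tensors:

```python
import numpy as np

def mps_amplitude(left, middles, right, bits):
    """Amplitude of one basis configuration of a matrix product state.

    left:    (2, D) boundary tensor      middles: list of (2, D, D) tensors
    right:   (2, D) boundary tensor      bits:    physical indices, one per site
    """
    v = left[bits[0]]                      # (D,)
    for A, b in zip(middles, bits[1:-1]):
        v = v @ A[b]                       # one (D, D) matrix per inner site
    return float(v @ right[bits[-1]])

# Bond dimension D = 1 reduces the MPS to a product state: amplitudes factorise.
left = np.array([[0.6], [0.8]])
mid = [np.array([[[0.0]], [[1.0]]])]       # inner site fixed to bit 1
right = np.array([[1.0], [0.0]])           # last site fixed to bit 0
print(mps_amplitude(left, mid, right, (1, 1, 0)))  # 0.8
```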

Or Sharir · Amnon Shashua · Giuseppe Carleo
-
(Poster) [ Visit Poster at Spot F3 in Virtual World ]   

The physical properties of the outer layers of stellar atmospheres (temperature, velocity and/or magnetic field) can be inferred by inverting the radiative transfer forward problem. The main obstacle is that the model required to synthesize the strong lines that sample the stellar chromospheres is extremely time consuming, which makes the solution of the inverse problem impractical. Here we leverage graph networks to predict the population number density of the atomic energy levels simply from the temperature and optical depth stratification. We demonstrate that a speedup by a factor of 10$^3$ can be obtained with a negligible impact on precision. This opens up the possibility of large-scale synthesis in three-dimensional models and routine inversion of observations to infer the 3D properties of the solar and stellar chromospheres.

Andreu Vicente Arevalo
-
(Poster) [ Visit Poster at Spot F2 in Virtual World ]   

This work investigates the interaction between a fluid solver and a CNN-based Poisson solver for unsteady incompressible flow simulations. During training, the network prediction is used to continue the computation in time, embedding the influence of the network prediction on the simulation through a long-term loss. This study investigates three implementations of such a loss, as well as the number of look-ahead iterations. On all test cases, results show that long-term losses are always beneficial. Interestingly, a partial implementation without a differentiable solver is found to be accurate, robust, and less costly than the full implementation.
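One reading of the long-term loss: unroll the learned solver over several look-ahead iterations and penalise the whole predicted trajectory, so the network is trained on the consequences of its own predictions. A schematic sketch with an arbitrary step function in place of the CNN Poisson solver:

```python
import numpy as np

def long_term_loss(step_fn, x0, targets):
    """Mean squared error accumulated over an unrolled rollout: the network's
    own predictions are fed back in, one step per look-ahead iteration."""
    x, loss = x0, 0.0
    for target in targets:
        x = step_fn(x)
        loss += np.mean((x - target) ** 2)
    return loss / len(targets)

decay = lambda x: 0.9 * x                    # stand-in for the learned solver
x0 = np.ones(4)
targets = [0.9 * x0, 0.9 * (0.9 * x0)]       # the exact trajectory of the map
print(long_term_loss(decay, x0, targets))    # 0.0
```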

Ekhi Ajuria Illarramendi · Michaël Bauerheim
-
(Poster) [ Visit Poster at Spot F1 in Virtual World ]

The Zwicky Transient Facility (ZTF), a state-of-the-art optical robotic sky survey, registers on the order of a million transient events - such as supernova explosions, changes in brightness of variable sources, or moving object detections - every clear night, generating real-time alerts. We present Alert-Classifying Artificial Intelligence (ACAI), an open-source deep-learning framework for the phenomenological classification of the ZTF alerts. ACAI uses a set of five binary classifiers to characterize objects, which in combination with the auxiliary/contextual event information available from alert brokers, provides a powerful tool for alert stream filtering tailored to different science cases, including early identification of supernova-like and anomalous transient events. We report on the performance of ACAI during the first months of deployment in a production setting.

Dmitry Duev
-
(Poster) [ Visit Poster at Spot F0 in Virtual World ]   

Forecasting bushfire spread is an important element in fire prevention and response efforts. Empirical observations of bushfire spread can be used to estimate fire response under certain conditions. These observations form rate-of-spread models, which can be used to generate simulations. We use machine learning to drive an emulation approach for bushfires and show that emulation has the capacity to closely reproduce simulated fire-front data. We present a preliminary emulator approach with the capacity for fast emulation of complex simulations. Large numbers of predictions can then be generated as part of ensemble estimation techniques, which provide more robust and reliable forecasts of stochastic systems.

Andrew Bolt · Petra Kuhnert · Joel Dabrowski
-
(Poster) [ Visit Poster at Spot E3 in Virtual World ]

The IceCube Neutrino Observatory is an astroparticle physics experiment that investigates neutrinos from the universe. Our task is to classify neutrino events and reconstruct events of interest. Graph neural networks (GNNs) have achieved great success in this area due to their powerful ability to model the irregular grid structure of the detectors. Unlike existing GNN-based methods, which neglect the quality of the constructed graph that the GNN operates on, we focus on the graph-construction step, using a score-based generative model to enhance the performance of downstream tasks. Extensive experiments verify the efficacy of our method.

Yiming Sun · Zixing Song · Irwin King
-
(Poster) [ Visit Poster at Spot E2 in Virtual World ]   

In this work we introduce group-equivariant self-attention models to address the problem of explainable radio galaxy classification in astronomy. We evaluate various orders of both cyclic and dihedral equivariance, and show that including equivariance as a prior both reduces the number of epochs required to fit the data and results in improved performance. We highlight the benefits of equivariance when using self-attention as an explainable model and illustrate how equivariant models statistically attend to the same features in their classifications as human astronomers.

Micah Bowles
-
(Poster) [ Visit Poster at Spot E1 in Virtual World ]

Efficient sampling of complex high-dimensional probability densities is a central task in computational science. Machine learning techniques based on autoregressive neural networks provide good approximations to probability distributions of interest in physics. In this work, we propose a systematic way to make this approximation unbiased by using it as an automatic generator of Markov chain Monte Carlo cluster updates. Symmetry enforcing and variable-size cluster updates are found to be essential to the success of this technique. We test our method for first- and second-order phase transitions of classical spin systems, showing its viability for critical systems and in the presence of metastable states.
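Using an approximate sampler as the proposal of a Metropolis-Hastings correction is what makes the approach unbiased; a generic sketch with Gaussians standing in for both the target and the learned autoregressive proposal:

```python
import numpy as np

def independence_mh(logp, sample_q, logq, n, rng):
    """Metropolis-Hastings with an independent proposal q (e.g. a trained
    autoregressive network). The acceptance ratio p(y)q(x) / (p(x)q(y))
    makes the chain sample p exactly, however imperfect q is."""
    x = sample_q(rng)
    chain = [x]
    for _ in range(n - 1):
        y = sample_q(rng)
        log_acc = (logp(y) - logp(x)) + (logq(x) - logq(y))
        if np.log(rng.uniform()) < log_acc:
            x = y
        chain.append(x)
    return np.array(chain)

logp = lambda x: -0.5 * x ** 2             # target: standard normal
logq = lambda x: -0.5 * (x / 2.0) ** 2     # proposal: wider normal
sample_q = lambda rng: rng.normal(0.0, 2.0)
chain = independence_mh(logp, sample_q, logq, 20_000, np.random.default_rng(6))
print(float(chain.mean()), float(chain.var()))
```

The unnormalised log-densities suffice because the normalisation constants cancel in the acceptance ratio.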

Dian Wu
-
(Poster) [ Visit Poster at Spot E0 in Virtual World ]   

Developing new methods of surface-water observation is crucial given increasingly frequent extreme hydrological events related to global warming and increasing demand for water. Orthophotos and digital surface models (DSMs) obtained using UAV photogrammetry can be used to determine the water surface height of a river. However, this task is difficult due to disturbances of the water surface on DSMs caused by limitations of photogrammetric algorithms. In this study, machine learning models were used to extract a single water-surface-elevation (WSE) value. A brand-new dataset has been prepared specifically for this purpose by hydrology and photogrammetry experts. The new method is an important step toward automating water-surface-level measurements with high spatial and temporal resolution. Such data can be used to validate and calibrate hydrological, hydraulic, and hydrodynamic models, making hydrological forecasts more accurate, in particular for predicting extreme and dangerous events such as floods or droughts. To our knowledge, this is the first approach in which a dataset was created for this purpose and deep learning models were used for this task. The obtained results have better accuracy compared to manual methods of determining WSE from photogrammetric DSMs. Additionally, a neuroevolution algorithm was employed to explore different architectures and find optimal models.

Marcin Pietroń
-
(Poster) [ Visit Poster at Spot D3 in Virtual World ]

Graph convolutional neural networks (GCNNs) have been shown to accurately predict materials properties by featurizing local atomic environments. However, such models have not yet been utilized for predicting per-site features such as Bader charge, magnetic moment, or site-projected band centers. In this work, we develop a per-site crystal graph convolutional neural network that predicts a wide array of per-site properties. This model outperforms a per-element average baseline, and is thus capturing the effect of the neighborhood around each atom. Using magnetic moments as a case study, we explore an example of underlying physics the per-site model is able to learn.

Jessica Karaguesian · Jaclyn Lunger · Rafa Gomez-Bombarelli
-
(Poster) [ Visit Poster at Spot D2 in Virtual World ]   

Strong lensing is a stunning physics phenomenon through which the light emitted from a distant cosmological source is distorted by the gravitational field of a foreground object distributed along the line of sight. Strong lensing observations are important, since, from their analysis, it is possible to infer properties of both the light-emitting source and the lens. In particular, precise lens modelling allows for the extraction of precious information on the distribution of dark matter in galaxies and clusters, which can provide tight constraints on several cosmological parameters. In this work, we consider the case where a comprehensive closed-form parametric model of the lens potential is only partially available, and we propose to model missing mass along the line-of-sight with a deep neural network. We incorporate the network within a fully differentiable, physically sound strong lensing simulator, and we train it via maximum likelihood estimation in an end-to-end fashion. Our experiments show that the model is able to effectively interact with the other components of the simulator and can successfully retrieve the underlying potential without any assumption on its form.

Luca Biggio
-
(Poster) [ Visit Poster at Spot D1 in Virtual World ]   

Identifying string theory vacua with desired physical properties at low energies requires searching through high-dimensional solution spaces -- collectively referred to as the string landscape. We highlight that this search problem is amenable to reinforcement learning and genetic algorithms. In the context of flux vacua, we are able to reveal novel features (suggesting previously unidentified symmetries) in the string theory solutions required for properties such as the string coupling. In order to identify these features robustly, we combine results from both search algorithms, which we argue is imperative for reducing sampling bias.

Andreas Schachner · Sven Krippendorf · Alex Cole · Gary Shiu
-
(Poster) [ Visit Poster at Spot D0 in Virtual World ]   

The solar wind consists of charged particles ejected from the Sun into interplanetary space and towards Earth. Understanding the magnetic field of the solar wind is crucial for predicting future space weather and planetary atmospheric loss. A lack of labeled data makes automated detection of discontinuities in the solar wind magnetic field challenging. We propose Deep-SWIM, an approach leveraging advances in contrastive learning, pseudo-labeling and online hard example mining to robustly identify discontinuities in solar wind magnetic field data. Through a systematic ablation study, we show that we can accurately classify discontinuities despite learning from only limited labeled data. Additionally, we show that our approach generalizes well and produces results that agree with expert hand-labeling.

Hala Lamdouar · Sairam Sundaresan · Anna Jungbluth · Marcella Scoczynski Ribeiro Martins · Anthony Sarah · Andres Munoz-Jaramillo
-
(Poster) [ Visit Poster at Spot C3 in Virtual World ]   

The Fourier Neural Operator (FNO) is a learning-based method for efficiently simulating partial differential equations. We propose the Factorized Fourier Neural Operator (F-FNO) that allows much better generalization with deeper networks. With a careful combination of the Fourier factorization, weight sharing, the Markov property, and residual connections, F-FNOs achieve a six-fold reduction in error on the most turbulent setting of the Navier-Stokes benchmark dataset. We show that our model maintains an error rate of 2% while still running an order of magnitude faster than a numerical solver, even when the problem setting is extended to include additional contexts such as viscosity and time-varying forces. This enables the same pretrained neural network to model vastly different conditions.
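The spectral core shared by FNO variants is: FFT, learned per-mode weights on the lowest modes, inverse FFT. A 1-D numpy sketch (the factorisation and weight sharing specific to the F-FNO are omitted):

```python
import numpy as np

def fourier_layer(x, mode_weights):
    """Mix the lowest Fourier modes of a 1-D signal with learned weights."""
    xf = np.fft.rfft(x)
    out = np.zeros_like(xf)
    m = len(mode_weights)
    out[:m] = xf[:m] * mode_weights        # truncate and reweight the spectrum
    return np.fft.irfft(out, n=x.size)

x = np.sin(np.linspace(0, 2 * np.pi, 16, endpoint=False))
identity = np.ones(9, dtype=complex)       # all 9 rfft modes of a 16-point signal
print(np.allclose(fourier_layer(x, identity), x))  # unit weights recover x: True
```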

Alasdair Tran · Alexander Mathews · Lexing Xie · Cheng Soon Ong
-
(Poster) [ Visit Poster at Spot C2 in Virtual World ]   

Computational fluid dynamics (CFD) is an invaluable tool in modern physics, but its time intensity and computational complexity limit its applicability to practical problems, e.g. in medicine. Surrogate methods could speed up inference and allow use in such time-critical applications. We consider the problem of estimating hemodynamic quantities (i.e. related to blood flow) on the surface of 3D artery geometries and employ anisotropic graph convolution in an end-to-end SO(3)-equivariant neural network operating directly on the polygonal surface mesh. We show that our network can accurately predict hemodynamic vectors for each vertex on the surface mesh, with a normalised mean absolute error of 0.6% and an approximation accuracy of 90.5%, demonstrating its feasibility as a surrogate method for CFD.

Julian Suk · Phillip Lippe · Christoph Brune · Jelmer Wolterink
-
(Poster) [ Visit Poster at Spot C1 in Virtual World ]   

Encoder-decoder networks such as U-Nets have been applied successfully in a wide range of computer vision tasks, especially for image segmentation of different flavours across different fields. Nevertheless, most applications lack a satisfying quantification of the uncertainty of the prediction. Yet a well-calibrated segmentation uncertainty can be a key element for scientific applications such as precision cosmology. In this ongoing work, we explore the use of the probabilistic version of the U-Net, recently proposed by Kohl et al. (2018), and adapt it to automate the segmentation of galaxies for large photometric surveys. We focus especially on the probabilistic segmentation of overlapping galaxies, also known as blending. We show that, even when training with a single ground truth per input sample, the model manages to properly capture a pixel-wise uncertainty on the segmentation map. Such uncertainty can then be propagated further down the analysis of the galaxy properties. To our knowledge, this is the first time such an experiment has been applied to galaxy deblending in astrophysics.

Hubert Bretonniere · Marc Huertas-Company
-
(Poster) [ Visit Poster at Spot C0 in Virtual World ]

We present Turbo-Sim, a generalised autoencoder framework derived from principles of information theory that can be used as a generative model. By maximising the mutual information between the input and the output of both the encoder and the decoder, we are able to rediscover the loss terms usually found in adversarial autoencoders as well as various more sophisticated related models. Our generalised framework makes these models mathematically interpretable and allows for a diversity of new ones by setting the weight of each term separately. The framework is also independent of the intrinsic architecture of the encoder and the decoder thus leaving a wide choice for the building blocks of the whole network. We apply Turbo-Sim to a collider physics generation problem: the transformation of the properties of several particles from a theory space, right after the collision, to an observation space, right after the detection in an experiment.

Guillaume Quétant · Vitaliy Kinakh · Tobias Golling · Slava Voloshynovskiy
-
(Poster) [ Visit Poster at Spot B3 in Virtual World ]

Devising optimal interventions for diffusive systems often requires the solution of the Hamilton-Jacobi-Bellman ({HJB}) equation, a nonlinear backward partial differential equation ({PDE}), that is, in general, nontrivial to solve. Existing control methods either tackle the HJB directly with grid-based PDE solvers, or resort to iterative stochastic path sampling to obtain the necessary controls. Here, we present a framework that interpolates between these two approaches. By reformulating the optimal interventions in terms of logarithmic gradients (\emph{scores}) of two forward probability flows, and by employing deterministic particle methods for solving Fokker-Planck equations, we introduce a novel \emph{deterministic} particle framework that computes the required optimal interventions in \emph{one-shot}.

Dimitra Maoutsa · Manfred Opper
-
(Poster) [ Visit Poster at Spot B2 in Virtual World ]

Deep learning tools are being used extensively in a range of scientific domains; in particular, there has been a steady increase in the number of geometric deep learning solutions proposed for problems involving structured or relational scientific data. In this work, we report on the performance of graph segmentation methods for two scientific datasets from different fields. Based on our observations, we discern the individual impact each type of graph segmentation method has on the dataset and how these methods can be used as precursors to deep learning pipelines.

Rajat Sahay · Savannah Thais
-
(Poster) [ Visit Poster at Spot B1 in Virtual World ]

Calorimeter simulation is the most computationally expensive part of Monte Carlo generation of samples necessary for analysis of experimental data at the Large Hadron Collider (LHC). The High-Luminosity upgrade of the LHC would require an even larger amount of such samples. We present a technique based on Discrete Variational Autoencoders (DVAEs) to simulate particle showers in Electromagnetic Calorimeters. We discuss how this work paves the way towards exploration of quantum annealing processors as sampling devices for generation of simulated High Energy Physics datasets.

Abhishek Abhishek
-
(Poster) [ Visit Poster at Spot B0 in Virtual World ]

We present a novel kernel-based anomaly detection algorithm for model-independent new physics searches. The model is based on a re-weighted version of kernel logistic regression and aims to learn the likelihood-ratio test statistic from simulated anomaly-free background data and experimental data. Model independence is enforced by avoiding any prior assumption about the presence or shape of new physics components in the data. This is made possible by kernel methods being non-parametric models that, given enough data, can approximate any continuous function and adapt to potentially any type of anomaly. The model shows dramatic advantages compared to similar neural network implementations in terms of training times and computational resources, while showing comparable performance. We test the model on datasets of different dimensionalities, showing that modern implementations of kernel methods are competitive options for large-scale problems.
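The density-ratio idea underlying such classifiers can be sketched with a bare-bones kernel logistic regression; the toy data, kernel width, and optimiser settings below are illustrative, not the paper's re-weighted implementation:

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    """Gaussian kernel matrix between two point sets."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_logreg(X, y, gamma=1.0, lr=0.1, steps=500, lam=1e-3):
    """Bare-bones kernel logistic regression fit by gradient descent."""
    K = rbf(X, X, gamma)
    a = np.zeros(len(X))
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-K @ a))
        a -= lr * (K @ (p - y) / len(X) + lam * (K @ a))
    return a

rng = np.random.default_rng(0)
background = rng.normal(0, 1, (100, 2))                  # anomaly-free reference
data = np.vstack([rng.normal(0, 1, (80, 2)),             # mostly background...
                  rng.normal(3, 0.3, (20, 2))])          # ...plus a localized bump
X = np.vstack([background, data])
y = np.r_[np.zeros(100), np.ones(100)]

a = kernel_logreg(X, y)
# The classifier logit estimates log p_data(x)/p_background(x) up to a constant,
# so it is large where the data over-fluctuates relative to the background.
score = rbf(np.array([[3.0, 3.0], [0.0, 0.0]]), X) @ a
print(score)
```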

Marco Letizia · Lorenzo Rosasco · Marco Rando
-
(Poster) [ Visit Poster at Spot A3 in Virtual World ]

Ordinary Differential Equation Variational Auto-Encoder (ODE2VAE) is a deep latent variable model that aims to learn complex distributions over high-dimensional sequential data and their low-dimensional representations in a hierarchical latent space. The hierarchical organization of the latent space embeds a physics-guided inductive bias in the model. In this paper, we analyze the latent representations inferred by the ODE2VAE model over three different physical motion datasets: bouncing balls, projectile motion, and simple pendulum. We show that the model is able to learn meaningful latent representations to an extent without any supervision.

Batuhan Koyuncu
-
(Poster) [ Visit Poster at Spot A2 in Virtual World ]

In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian spectrum to learn the position of each node in a given graph. This LPE is then added to the node features of the graph and passed to a fully-connected Transformer. By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance. Further, by fully connecting the graph, the Transformer does not suffer from over-squashing, an information bottleneck of most GNNs, and enables better modeling of physical phenomena such as heat transfer and electric interactions. When tested empirically on a set of 4 standard datasets, our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin, becoming the first fully-connected architecture to perform well on graph benchmarks.
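The learned positional encoding starts from the Laplacian eigenvectors themselves; a minimal sketch of the spectral encoding before any learning:

```python
import numpy as np

def laplacian_pe(adj, k):
    """Positional encodings from the k lowest non-trivial Laplacian
    eigenvectors (taking k = n - 1 uses the full spectrum)."""
    L = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return vecs[:, 1:k + 1]          # drop the constant (trivial) eigenvector

# 4-node path graph: the first non-trivial eigenvector orders nodes along the path.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
pe = laplacian_pe(adj, 2)
print(pe.shape)  # (4, 2)
```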

Devin Kreuzer · Will Hamilton · Vincent Létourneau
-
(Poster)

Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) different atom types have complex, yet specific bonding preferences. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We generate significantly more realistic materials than past methods in two tasks: 1) reconstructing the input structure, and 2) generating valid, diverse, and realistic materials. Our contribution also includes the creation of several standard datasets and evaluation metrics for the broader machine learning community.
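The decoder's diffusion process moves atomic coordinates toward a lower-energy state step by step. A minimal sketch of one such Langevin-style update, assuming a toy harmonic energy whose gradient is known in closed form (CDVAE instead learns a score network; the names and energy here are purely illustrative):

```python
import numpy as np

def langevin_step(coords, energy_grad, rng, step=1e-2, noise=1e-3):
    """One Langevin-style update: descend the energy gradient with a small
    injection of noise, nudging atomic coordinates toward a lower-energy
    configuration."""
    return (coords
            - step * energy_grad(coords)
            + noise * rng.standard_normal(coords.shape))

# Toy energy: harmonic wells centered on ideal positions, so grad = x - ideal.
ideal = np.array([[0.0, 0.0, 0.0],
                  [1.5, 0.0, 0.0]])
rng = np.random.default_rng(0)
x = ideal + 0.5  # start from a perturbed structure
for _ in range(400):
    x = langevin_step(x, lambda c: c - ideal, rng)
```

After many steps the coordinates settle near the energy minimum; in the real model the gradient comes from a learned, periodicity- and symmetry-aware score function rather than a hand-written energy.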

Tian Xie · Xiang Fu · Octavian Ganea · Regina Barzilay · Tommi Jaakkola
-
(Poster)

We present the use of self-supervised learning to explore and exploit large unlabeled datasets. Focusing on 42 million galaxy images from the latest data release of the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys, we first train a self-supervised model to distil low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image. We then use the representations to construct and publicly release an interactive semantic similarity search tool. We demonstrate how our tool can be used to rapidly discover rare objects given only a single example, increase the speed of crowd-sourcing campaigns, flag bad data, and construct and improve training sets for supervised applications. While we focus on images from sky surveys, the technique is straightforward to apply to any scientific dataset of any dimensionality. The similarity search web app is publicly available.
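The core of such a similarity search is a nearest-neighbor query over the learned representations. A minimal sketch using cosine similarity over a bank of embeddings, assuming random vectors stand in for the self-supervised representations (the function name and shapes are illustrative, not the released tool's API):

```python
import numpy as np

def most_similar(query_vec, bank, top_k=3):
    """Cosine-similarity search: given one query representation, return the
    indices of the top_k most similar items in the embedding bank."""
    q = query_vec / np.linalg.norm(query_vec)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q                     # cosine similarity to every item
    return np.argsort(-sims)[:top_k]  # indices, most similar first

rng = np.random.default_rng(42)
bank = rng.standard_normal((1000, 128))             # stand-in embeddings
query = bank[17] + 0.01 * rng.standard_normal(128)  # near-duplicate of item 17
idx = most_similar(query, bank)
```

This is how a single example of a rare object can retrieve its look-alikes from millions of images: one dot product per item against a precomputed, normalized embedding bank.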

George Stein

Author Information

Anima Anandkumar (NVIDIA / Caltech)

Anima Anandkumar is a Bren Professor in the Caltech CMS department and a director of machine learning research at NVIDIA. Her research spans both theoretical and practical aspects of large-scale machine learning. In particular, she has spearheaded research in tensor-algebraic methods, non-convex optimization, probabilistic models, and deep learning. Anima is the recipient of several awards and honors, such as the Bren named chair professorship at Caltech, an Alfred P. Sloan Fellowship, Young Investigator awards from the Air Force and Army research offices, faculty fellowships from Microsoft, Google, and Adobe, and several best paper awards. Anima received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, a visiting researcher at Microsoft Research New England in 2012 and 2014, an assistant professor at U.C. Irvine between 2010 and 2016, an associate professor at U.C. Irvine between 2016 and 2017, and a principal scientist at Amazon Web Services between 2016 and 2018.

Kyle Cranmer (New York University)

Kyle Cranmer is an Associate Professor of Physics at New York University and affiliated with NYU's Center for Data Science. He is an experimental particle physicist working primarily on the Large Hadron Collider, based in Geneva, Switzerland. He was awarded the Presidential Early Career Award for Science and Engineering in 2007 and the National Science Foundation's CAREER Award in 2009. Professor Cranmer developed a framework that enables collaborative statistical modeling, which was used extensively for the discovery of the Higgs boson in July 2012. His current interests are at the intersection of physics and machine learning and include inference in the context of intractable likelihoods, development of machine learning models imbued with physics knowledge, adversarial training for robustness to systematic uncertainty, the use of generative models in the physical sciences, and integration of reproducible workflows in the inference pipeline.

Mr. Prabhat (LBL/NERSC)
Lenka Zdeborová (CEA)
Atilim Gunes Baydin (University of Oxford)
Juan Carrasquilla (Vector Institute)

Juan Carrasquilla is a full-time researcher at the Vector Institute for Artificial Intelligence in Toronto, Canada, where he works at the intersection of condensed matter physics, quantum computing, and machine learning, such as combining quantum Monte Carlo simulations and machine learning techniques to analyze the collective behaviour of quantum many-body systems. He completed his PhD in Physics at the International School for Advanced Studies in Italy and has since held positions as a Postdoctoral Fellow at Georgetown University and the Perimeter Institute, as a Visiting Research Scholar at Penn State University, and as a Research Scientist at D-Wave Systems Inc. in Burnaby, British Columbia.

Emine Kucukbenli (Boston University)
Gilles Louppe (University of Liège)
Benjamin Nachman (Lawrence Berkeley National Laboratory)
Brian Nord (Fermi National Accelerator Laboratory)
Savannah Thais (Princeton University)
