Workshop
NeurIPS 2023 Workshop: Machine Learning and the Physical Sciences
Brian Nord · Atilim Gunes Baydin · Adji Bousso Dieng · Emine Kucukbenli · Siddharth Mishra-Sharma · Benjamin Nachman · Kyle Cranmer · Gilles Louppe · Savannah Thais
Hall B2 (level 1)
Physical sciences and machine learning: more than the sum of their parts. Join us to discuss research at the convergence of these fields!
Schedule
Fri 6:15 a.m. - 6:30 a.m.
|
Opening remarks
(
Introductory remarks by the organizers
)
>
SlidesLive Video |
🔗 |
Fri 6:30 a.m. - 6:55 a.m.
|
Benefits of Approximate and Partial Equivariance
(
Invited talk
)
>
SlidesLive Video |
Shubhendu Trivedi 🔗 |
Fri 6:55 a.m. - 7:20 a.m.
|
Interpretable deep learning for protein modeling
(
Invited talk
)
>
SlidesLive Video |
Maria Rodriguez Martinez 🔗 |
Fri 7:20 a.m. - 8:15 a.m.
|
Panel on inductive biases and interpretability
(
Panel discussion
)
>
SlidesLive Video |
Shubhendu Trivedi · Anuj Karpatne · Joshua Speagle · Savannah Thais 🔗 |
Fri 8:15 a.m. - 8:45 a.m.
|
Coffee break
(
Coffee break
)
>
|
🔗 |
Fri 8:45 a.m. - 9:00 a.m.
|
Removing Dust from CMB Observations with Diffusion Models
(
Contributed talk
)
>
SlidesLive Video |
David Heurtel-Depeiges 🔗 |
Fri 9:00 a.m. - 10:15 a.m.
|
Poster session 1
(
Poster session
)
>
Physical poster session: Hall B2 (level 1) Virtual poster session (GatherTown): [ protected link dropped ] |
🔗 |
Fri 10:00 a.m. - 11:30 a.m.
|
Lunch break
(
Lunch break
)
>
|
🔗 |
Fri 11:30 a.m. - 12:00 p.m.
|
What's missing? A speculative sketch of the future of machine learning and science
(
Invited talk
)
>
SlidesLive Video |
Alexander Alemi 🔗 |
Fri 12:00 p.m. - 1:15 p.m.
|
Poster session 2
(
Poster session
)
>
Physical poster session: Hall B2 (level 1) Virtual poster session (GatherTown): [ protected link dropped ] |
🔗 |
Fri 1:15 p.m. - 1:30 p.m.
|
Coffee break
(
Coffee break
)
>
|
🔗 |
Fri 1:30 p.m. - 1:45 p.m.
|
Towards an Astronomical Foundation Model for Stars
(
Contributed talk
)
>
SlidesLive Video |
Henry Leung 🔗 |
Fri 1:45 p.m. - 2:00 p.m.
|
KeyCLD: Learning Constrained Lagrangian Dynamics in Keypoint Coordinates from Images
(
Contributed talk
)
>
SlidesLive Video |
Rembert Daems 🔗 |
Fri 2:00 p.m. - 2:15 p.m.
|
Ultra Fast Transformers on FPGAs for Particle Physics Experiments
(
Contributed talk
)
>
SlidesLive Video |
Elham E Khoda 🔗 |
Fri 2:15 p.m. - 3:15 p.m.
|
Panel on institutional support and funding
(
Panel discussion
)
>
SlidesLive Video |
Jesse Thaler · Max Welling · John Wu · Sara Hooker 🔗 |
Fri 3:15 p.m. - 3:30 p.m.
|
Closing remarks
(
Closing remarks by the organizers
)
>
SlidesLive Video |
🔗 |
-
|
Control-aware echo state networks (Ca-ESN) for the suppression of extreme events
(
Poster
)
>
Extreme events are sudden large-amplitude changes in the state or observables of chaotic nonlinear systems, which characterize many scientific phenomena. Because of their violent nature, extreme events typically have adverse consequences, which call for methods to prevent the events from happening. In this work, we introduce the control-aware echo state network (Ca-ESN) to seamlessly combine ESNs and control strategies, such as proportional-integral-derivative and model predictive control, to suppress extreme events. The methodology is showcased on a chaotic-turbulent flow, in which we reduce the occurrence of extreme events with respect to traditional methods by two orders of magnitude. This work opens up new possibilities for the efficient control of nonlinear systems with neural networks. |
Alberto Racca · Luca Magri 🔗 |
-
|
KeyCLD: Learning Constrained Lagrangian Dynamics in Keypoint Coordinates from Images
(
Poster
)
>
We present KeyCLD, a framework to learn Lagrangian dynamics from images. Learned keypoints represent semantic landmarks in images and can directly represent state dynamics. We show that interpreting this state as Cartesian coordinates, coupled with explicit holonomic constraints, allows expressing the dynamics with a constrained Lagrangian. KeyCLD is trained unsupervised end-to-end on sequences of images. Our method explicitly models the mass matrix, potential energy and the input matrix, thus allowing energy-based control. We demonstrate learning of Lagrangian dynamics from images on the dm_control pendulum, cartpole and acrobot environments. KeyCLD can be learned on these systems, whether they are unactuated, underactuated or fully actuated. Trained models are able to produce long-term video predictions, showing that the dynamics are accurately learned. We compare with Lag-VAE, Lag-caVAE and HGN, and investigate the benefit of the Lagrangian prior and the constraint function. KeyCLD achieves the highest valid prediction time on all benchmarks. Additionally, a very straightforward energy shaping controller is successfully applied on the fully actuated systems. |
Rembert Daems · · Francis Wyffels · Guillaume Crevecoeur 🔗 |
-
|
Incorporating Additive Separability into Hamiltonian Neural Networks for Regression and Interpretation
(
Poster
)
>
Hamiltonian neural networks are state-of-the-art models that regress the vector field of a dynamical system under the learning bias of Hamilton’s equations. A recent observation is that incorporating a second-order bias regarding the additive separability of the Hamiltonian reduces the regression complexity and improves regression performance. We propose three separable Hamiltonian neural networks that incorporate additive separability within Hamiltonian neural networks using observational, learning and inductive biases. We show that the proposed models are more effective than a Hamiltonian neural network at regressing a vector field, and have the capability to interpret the kinetic and potential energy of the system. |
Zi-Yu Khoo · Jonathan Sze Choong Low · Stéphane Bressan 🔗 |
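For readers unfamiliar with the term, additive separability can be stated with the standard textbook form of Hamilton's equations. The block below is a general reminder and is not taken from the paper itself; the symbols $q$, $p$, $T$ and $V$ follow the usual convention for position, momentum, kinetic energy and potential energy.

```latex
\begin{align}
  \dot{q} &= \frac{\partial H}{\partial p}, &
  \dot{p} &= -\frac{\partial H}{\partial q}, \\
  H(q, p) &= T(p) + V(q)
  \;\Longrightarrow\;
  \dot{q} = \frac{\mathrm{d}T}{\mathrm{d}p}, \quad
  \dot{p} = -\frac{\mathrm{d}V}{\mathrm{d}q}, \quad
  \frac{\partial^2 H}{\partial q\,\partial p} = 0 .
\end{align}
```

The vanishing mixed partial derivative is the second-order property that a separable model can exploit, and the split into $T(p)$ and $V(q)$ is what makes the kinetic and potential energy individually readable.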
-
|
Extracting an Informative Latent Representation of High-Dimensional Galaxy Spectra
(
Poster
)
>
We report the discovery of four latent variables that effectively capture the complexity of high-dimensional galaxy spectra from the Sloan Digital Sky Survey. We investigate the spectral ranges and the physical properties of galaxies that are most informative for explaining the observed spectra. Employing Variational Autoencoders (VAEs) and conditional VAEs, both being generative models proficient at capturing intricate details in high-dimensional data, we show that these four latent parameters provide more information than traditionally utilized physical properties such as stellar mass, star formation rate, specific star formation rate, and metallicity. We highlight that the spectral features providing the most insight include the range below 5000 Å and the wavelengths corresponding to the emission lines ([O II], [O III], and Hα). Our results indicate that we can construct a more efficient representation of galaxy spectra based on these latent parameters, which are more fundamental than currently acknowledged physical properties.
|
Daiki Iwasaki · Suchetha Cooray · Tsutomu Takeuchi 🔗 |
-
|
When Black-box PDE Solvers Meet Deep Learning: End-to-End Mesh Optimization for Efficient Fluid Flow Prediction
(
Poster
)
>
Deep learning has been widely applied to solve partial differential equations (PDEs) in computational fluid dynamics. Recent research proposed a PDE correction framework that leverages deep learning to correct the solution obtained by a PDE solver on a coarse mesh. However, end-to-end training of such a PDE correction model over both solver-dependent parameters, such as mesh parameters, and neural network parameters requires the PDE solver to support automatic differentiation through the iterative numerical process. Such a feature is not readily available in many existing solvers. In this study, we explore the feasibility of end-to-end training of a hybrid model with a black-box PDE solver and a deep learning model for fluid flow prediction. Specifically, we investigate a hybrid model that integrates a black-box PDE solver into a differentiable deep graph neural network. To train this model, we use a zeroth-order gradient estimator to differentiate the PDE solver via forward propagation. Experiments show that the proposed approach, based on a zeroth-order estimator, produces correction models that outperform the baseline model trained using a first-order method with a frozen input mesh to the solver. |
Shaocong Ma · James Diffenderfer · Bhavya Kailkhura · Yi Zhou 🔗 |
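To make the idea of a zeroth-order gradient estimator concrete, here is a minimal sketch that differentiates a black-box function using forward evaluations only. It uses a Gaussian-smoothing estimator, which is one common choice and not necessarily the estimator used by the authors. The function `black_box_solver`, the smoothing radius `mu`, the sample counts and the step size are all hypothetical stand-ins.

```python
import random


def black_box_solver(mesh_params):
    """Hypothetical stand-in for a non-differentiable PDE solver plus loss.

    Returns a scalar that is minimised at mesh_params == [1.0, 1.0, ...].
    """
    return sum((p - 1.0) ** 2 for p in mesh_params)


def zeroth_order_grad(f, x, mu=1e-3, num_samples=200, rng=None):
    """Gaussian-smoothing estimator: mean of ((f(x + mu*u) - f(x)) / mu) * u.

    Only forward evaluations of f are needed, so f may be a black box.
    """
    if rng is None:
        rng = random.Random(0)  # assumption: fixed seed for reproducibility
    fx = f(x)
    grad = [0.0] * len(x)
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in x]            # random direction
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]  # perturbed input
        scale = (f(x_pert) - fx) / mu                    # directional slope
        for i, ui in enumerate(u):
            grad[i] += scale * ui / num_samples
    return grad


# Usage: plain gradient descent driven by the estimated gradient.
rng = random.Random(0)
params = [0.0, 0.0]
for _ in range(50):
    g = zeroth_order_grad(black_box_solver, params, num_samples=100, rng=rng)
    params = [p - 0.1 * gi for p, gi in zip(params, g)]
print("final loss:", black_box_solver(params))
```

The estimate is noisy, so more samples per step trade extra solver calls for lower variance; in a hybrid model this estimated gradient would be chained with ordinary backpropagation through the neural network part.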
-
|
Physics-consistency of infinite neural networks
(
Poster
)
>
Recent work demonstrates the integration of physics prior knowledge into neural networks through neural activation functions and the infinite-width correspondence to Gaussian processes, provided the Central Limit Theorem holds. Together with the construction of physics-consistent Gaussian process kernels, this connection raises the question of whether physics-consistent infinite neural networks can be constructed. Regression models so construed find specialized applications such as inverse problems, uncertainty quantification, and optimization, particularly in data-scarce situations. These 'surrogate' models can efficiently learn from limited data while maintaining physical consistency. |
Sascha Ranftl 🔗 |
-
|
Pay Attention to Mean Fields for Point Cloud Generation
(
Poster
)
>
Collider data generation via machine learning is gaining traction in particle physics due to the computational cost of traditional Monte Carlo simulations, especially for future high-luminosity colliders. This study presents a model using linearly scaling attention-based aggregation. The model is trained in an adversarial setup, ensuring input permutation equivariance for the generator and permutation invariance for the critic. A feature matching loss is introduced to stabilise the adversarial training, which is known to be unstable. Results are presented for two different datasets. On the JetNet150 dataset, the model is competitive with, but more parameter-efficient than, the current state-of-the-art GAN-based model. The model has been extended to handle the CaloChallenge Dataset 2, where each point cloud contains up to 30× more points than in the previous dataset. The model and its corresponding code will be made available upon publication.
|
Benno Käch · Isabell Melzer · Dirk Krücker 🔗 |
-
|
Simulation-based Inference for Cardiovascular Models
(
Poster
)
>
Over the past decades, hemodynamics simulators have become tools of choice for studying cardiovascular systems and are routinely used to simulate whole-body hemodynamics from physiological parameters. Nevertheless, solving the corresponding inverse problem of mapping waveforms back to plausible physiological parameters remains challenging. Motivated by advances in simulation-based inference (SBI), we cast this inverse problem as statistical inference. Our study highlights the potential of estimating new biomarkers from standard-of-care measurements and reveals practically relevant findings that cannot be captured by standard sensitivity analyses, such as the existence of sub-populations for which parameter estimation exhibits distinct uncertainty regimes. In addition, we study how such insights obtained in-silico transfer to in-vivo with the MIMIC-III database. |
Antoine Wehenkel · Jens Behrmann · Andy Miller · Guillermo Sapiro · Ozan Sener · Marco Cuturi · Joern-Henrik Jacobsen 🔗 |
-
|
Fast SoC thermal simulation with physics-aware U-Net
(
Poster
)
>
Fast thermal simulation for System on Chip (SoC) plays a crucial role in the integrated circuit (IC) design industry, particularly as power density escalates with increasing computational requirements. It is imperative to assess thermal performance comprehensively during the design phase, utilizing a rapid and precise thermal simulator to expedite design iterations. In this paper, we introduce a fast, physics-aware thermal simulator that draws inspiration from Fourier's law and the Fourier-Biot equation, which correspond to the first and second derivatives of the temperature map. Consequently, the learning objective evolves from merely translating images to approximating natural phenomena such as the thermal gradient and thermal Laplacian. By replacing the image-based loss with a thermal-aware loss, the proposed model achieves lower prediction error, higher data efficiency, and more physically accurate behavior. The present model demonstrates a significant improvement, achieving a 34% reduction in Maximum Temperature Error (MTE), showcasing the potential for integrating physics-aware learning into SoC thermal design. |
Yu-Sheng Lin · Li-Song Lin · Chin-Jui Chang · Ting-Yu Lin · Shih-Hong Pan · Ya-Wen Yu · Kai-En Yang · Wei Cheng Lee · Yi-Chen Lin · Tai-Yu Chen · Jason Yeh
|
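A loss built on the first and second derivatives of the temperature map can be sketched with finite differences. The abstract does not give the exact formulation, so the forward-difference gradient, the five-point Laplacian stencil, the mean-squared-error terms and the weights `w_grad` and `w_lap` below are all assumptions chosen for illustration.

```python
def grad_xy(T):
    """Forward-difference gradient of a 2D grid given as a list of rows.

    Returns (dT/dx, dT/dy), each of shape (H-1) x (W-1).
    """
    H, W = len(T), len(T[0])
    gx = [[T[i][j + 1] - T[i][j] for j in range(W - 1)] for i in range(H - 1)]
    gy = [[T[i + 1][j] - T[i][j] for j in range(W - 1)] for i in range(H - 1)]
    return gx, gy


def laplacian(T):
    """Five-point-stencil Laplacian evaluated on interior grid points."""
    H, W = len(T), len(T[0])
    return [
        [T[i - 1][j] + T[i + 1][j] + T[i][j - 1] + T[i][j + 1] - 4 * T[i][j]
         for j in range(1, W - 1)]
        for i in range(1, H - 1)
    ]


def mse(A, B):
    """Mean squared error between two equally shaped 2D grids."""
    diffs = [(a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def thermal_aware_loss(pred, target, w_grad=1.0, w_lap=1.0):
    """Penalise mismatch in thermal gradient and thermal Laplacian."""
    pgx, pgy = grad_xy(pred)
    tgx, tgy = grad_xy(target)
    grad_term = mse(pgx, tgx) + mse(pgy, tgy)
    lap_term = mse(laplacian(pred), laplacian(target))
    return w_grad * grad_term + w_lap * lap_term


# Usage on a small synthetic temperature map T(i, j) = i^2 + j^2.
target = [[i * i + j * j for j in range(4)] for i in range(4)]
bumped = [row[:] for row in target]
bumped[1][1] += 3  # a local hot spot the prediction got wrong
print("loss with hot-spot error:", thermal_aware_loss(bumped, target))
```

One property worth noting: a loss made only of derivative terms is blind to a constant temperature offset, so a practical objective would probably keep some absolute-value term alongside it.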
-
|
Unsupervised segmentation of irradiation-induced order–disorder phase transitions in electron microscopy
(
Poster
)
>
We present a method for the unsupervised segmentation of electron microscopy images, which are powerful descriptors of materials and chemical systems. Images are oversegmented into overlapping chips, and similarity graphs are generated from embeddings extracted from a domain-pretrained convolutional neural network (CNN). The Louvain method for community detection is then applied to perform segmentation. The graph representation provides an intuitive way of presenting the relationship between chips and communities. We demonstrate our method to track irradiation-induced amorphous fronts in thin films used for catalysis and electronics. This method has potential for "on-the-fly" segmentation to guide emerging automated electron microscopes. |
Arman Ter-Petrosyan · Jenna A Bilbrey · Christina Doty · Bethany Matthews · Le Wang · Yingge Du · Eric Lang · Khalid Hattar · Steven Spurgeon 🔗 |
-
|
Attention-guided neural differential equations for physics-constrained deep learning of ion transport
(
Poster
)
>
Species transport models typically combine partial differential equations (PDEs) with relations from hindered transport theory to quantify electromigrative, convective, and diffusive transport through complex nanoporous systems; however, these formulations are frequently substantial simplifications of the governing dynamics, leading to the poor generalization performance of PDE-based models. Given the growing interest in deep learning methods for the physical sciences, we develop a machine learning-based approach to characterize ion transport across nanoporous membranes. Our proposed framework centers around attention-guided neural differential equations that incorporate electroneutrality-based inductive biases to improve generalization performance relative to conventional PDE-based methods. In addition, we study the role of the attention mechanism in illuminating physically-meaningful ion-pairing relationships across diverse mixture compositions. Further, we investigate the importance of pre-training on synthetic data from PDE-based models, as well as the performance benefits from hard vs. soft inductive biases. Our results indicate that physics-informed deep learning solutions can outperform their classical PDE-based counterparts and provide promising avenues for modelling complex transport phenomena across diverse applications. |
Danyal Rehman · John Lienhard 🔗 |
-
|
Learning Closure Relations using Differentiable Programming: An Example in Radiation Transport
(
Poster
)
>
The continuous flow or 'transport' of a macroscopic system of particles is a high-dimensional problem and is therefore often solved using reduced order models. This necessarily introduces unknown closure relations into these models. In this work, we present a machine learning approach to finding accurate closure relations utilising differentiable programming. As a case study, we consider the transport of photons and use a radiation transport test problem from the literature as a training dataset. We present novel ML closures for a number of reduced order models which outperform their literature counterparts in both trained and unseen problems. |
Aidan Crilly · Benjamin Duhig · Nacime Bouziani 🔗 |
-
|
DFT Hamiltonian Neural Network Training with Semi-supervised Learning
(
Poster
)
>
Recent efforts have focused on training neural networks to replace density functional theory (DFT) calculations. However, prior neural network training methods required an extensive number of DFT simulations to obtain the ground truth (Hamiltonians). Conversely, when working with limited training data, deep learning models often exhibit increased errors in predicting Hamiltonians and band structures for testing data. This phenomenon carries the potential risk of yielding inaccurate physical interpretations, including the emergence of unphysical branches within band structures. To address this challenge, we introduce a novel deep learning-based method for calculating DFT Hamiltonians, specifically designed to generate accurate results with limited training data. Our framework not only employs supervised learning with the calculated Hamiltonian but also generates pseudo Hamiltonians (targets for unlabeled data) and trains the neural networks on unlabeled data. We compare our results with those obtained using the state-of-the-art method, which trains neural networks using atomic structures as inputs and DFT Hamiltonians as targets. We demonstrate the superior performance of our framework compared to the previous approach on various datasets, such as MoS2, Bi2Te3, HfO2, and InGaAs. |
Yucheol Cho · Guenseok Choi · Gyeongdo Ham · Mincheol Shin · Dae-Shik Kim 🔗 |
-
|
CaloLatent: Score-based Generative Modelling in the Latent Space for Calorimeter Shower Generation
(
Poster
)
>
Fast calorimeter simulation is crucial for collider physics to accelerate the comparisons between theory and experiments. Physics simulators are often precise but slow to generate the required highly granular detector response. Fast surrogate models based on machine learning have shown great promise by leveraging modern computational hardware and their ability to capture the complex and high-dimensional space of calorimeter detectors. In this paper we introduce a new fast surrogate model based on latent diffusion models named CaloLatent, able to reproduce, with high fidelity, the detector response in a fraction of the time required by similar generative models. We evaluate the generation quality and speed using the Calorimeter Simulation Challenge 2022 dataset. |
Thandikire Madula · Vinicius Mikuni 🔗 |
-
|
Predicting Galaxy Interloper Fraction with GNNs
(
Poster
)
>
Upcoming emission line spectroscopic surveys, such as Euclid and the Roman Space Telescope, will be prone to systematics due to the presence of interlopers: galaxies whose redshift and distance from us are miscalculated due to line confusion in their emission spectra. Particularly pernicious are interlopers involving the confusion between two lines with close emitted wavelengths, since these interlopers correlate with the target galaxies. An interesting example is Hβ emitters mistaken for [O III] emitters. They introduce a particular pattern in the 3D distribution of the observed galaxy catalog that can bias the cosmological analysis performed with that sample. We present a novel method to predict the fraction of interlopers in a galaxy catalog, using simulations and halos as a proxy for galaxies. This method uses Graph Neural Networks to learn the posterior distribution of the interloper fraction while marginalizing over cosmological and astrophysical unknowns.
|
Elena Massara · Francisco Villaescusa · Will Percival 🔗 |
-
|
ML-Enhanced Generalized Langevin Equation for Transient Anomalous Diffusion in Polymer Dynamics
(
Poster
)
>
In this work, we introduce an ML framework to generate long-term single-polymer dynamics by exploiting short-term trajectories from molecular dynamics (MD) simulations of homopolymer melts. Even with current advances in machine learning for MD, these polymeric materials are difficult to simulate and characterize due to prohibitive computational costs when long timescales are involved. Our method relies on a 3D neural autoregressive (NAR) model for collective variables (CVs), which enhances the Generalized Langevin Equation capabilities in modeling diffusion phenomena. ML-GLE is capable of reproducing long-term single polymer statistical properties, predicting the diffusion coefficient, and resulting in an enormous acceleration in terms of simulation time. Moreover, it is also scalable with system size. |
Gian-Michele Cherchi · Alain Dequidt · Patrice Hauret · Arnaud Guillin · Vincent Barra · Nicolas Martzel 🔗 |
-
|
Ensemble models outperform single model uncertainties and predictions for operator-learning of hypersonic flows
(
Poster
)
>
High-fidelity computational simulations and physical experiments of hypersonic flows are resource intensive. Training scientific machine learning (SciML) models on limited high-fidelity data offers one approach to rapidly predict behaviors for situations that have not been seen before. However, high-fidelity data is itself too limited in quantity to validate all outputs of the SciML model in unexplored input space. As such, an uncertainty-aware SciML model is desired. The SciML model’s output uncertainties could then be used to assess the reliability and confidence of the model’s predictions. In this study, we extend a deep operator network (DeepONet) using three different uncertainty quantification mechanisms: mean-variance estimation (MVE), evidential uncertainty, and ensembling. The uncertainty-aware DeepONet models are trained and evaluated on the hypersonic flow around a blunt cone object with data generated via computational fluid dynamics over a wide range of Mach numbers and altitudes. We find that ensembling outperforms the other two uncertainty models in terms of minimizing error and calibrating uncertainty in both interpolative and extrapolative regimes. |
Victor Leon · Noah Ford · Honest Mrema · Jeffrey Gilbert · Alexander New 🔗 |
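The ensembling mechanism can be illustrated on a toy problem: train several models on resampled data, then use the spread of their predictions as the uncertainty. The sketch below replaces the DeepONet with one-dimensional least-squares lines and uses bootstrap resampling to diversify the members. Both choices, along with the synthetic data and all names, are assumptions made for illustration and do not reproduce the authors' setup.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train_ensemble(xs, ys, n_members=20, seed=0):
    """Fit each member on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return members


def predict(members, x):
    """Ensemble mean as the prediction, member spread as the uncertainty."""
    preds = [a * x + b for a, b in members]
    return statistics.mean(preds), statistics.stdev(preds)


# Synthetic training data on [0, 1]: y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(1)
xs = [i / 29 for i in range(30)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

members = train_ensemble(xs, ys)
mean_in, std_in = predict(members, 0.5)    # interpolation
mean_out, std_out = predict(members, 5.0)  # extrapolation
print("x=0.5:", mean_in, "+/-", std_in)
print("x=5.0:", mean_out, "+/-", std_out)
```

In this toy setting the member spread grows away from the training range, which is the qualitative behaviour one hopes for in an extrapolative regime. Whether the spread is well calibrated is a separate question that needs held-out data.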
-
|
A Multi-Grained Group Symmetric Framework for Learning Protein-Ligand Binding Dynamics
(
Poster
)
>
In drug discovery, molecular dynamics (MD) simulation for protein-ligand binding provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. While significant strides have been made in advancing more efficient MD simulations, the accurate modeling of extended-timescale simulations remains a considerable challenge. To address this issue, we propose NeuralMD, the first approach to learning to predict binding dynamics. It builds upon a novel multi-grained group symmetric framework and effectively incorporates physical laws from two perspectives: (1) the geometric representation of the protein-ligand complex should be SE(3)-equivariant and should sufficiently capture the particle interplay between protein and ligand, and (2) the trajectory learning for binding dynamics needs to follow Newtonian mechanics. To verify the effectiveness of NeuralMD, we design ten single-trajectory and three multi-trajectory binding simulation tasks. Quantitatively, NeuralMD outperforms three competitive machine learning baselines for binding MD simulation. |
Shengchao Liu · Weitao Du · Yanjing Li · Nakul Rampal · Zhuoxinran Li · Vignesh Bhethanabotla · Omar Yaghi · Christian Borgs · Anima Anandkumar · Hongyu Guo · Jennifer Chayes
|
-
|
Discovering Black Hole Mass Scaling Relations with Symbolic Regression
(
Poster
)
>
Our knowledge of supermassive black holes (SMBHs) and their relation to their host galaxies is still limited, and there are only around 150 SMBHs that have their masses directly measured and confirmed. Better black hole mass scaling relations will help us reveal the physics of black holes, as well as predict black hole masses that are not yet measured. Here, we apply symbolic regression, combined with random forest, to those directly measured black hole masses and host galaxy properties, and find a collection of higher-dimensional (N-D) black hole mass scaling relations. These N-D black hole mass scaling relations have scatter smaller than any of the existing black hole mass scaling relations. One of the best among them involves the parameters of central stellar velocity dispersion, bulge-to-total ratio, and density at the black hole's sphere-of-influence, with an intrinsic scatter of ε = 0.083 dex, significantly lower than ε ~ 0.3 dex for the M-σ relation. These relations will inspire black hole physics, test black hole models implemented in simulations, and estimate unknown black hole masses with unprecedented precision.
|
Zehao Jin · Benjamin Davis 🔗 |
-
|
Hydrogen Diffusion through Polymer using Deep Reinforcement Learning
(
Poster
)
>
Robust and cost-effective hydrogen storage is considered an enabling technology for a carbon-free, renewable-energy society. Hydrogen tanks with polymer liners are already on the market and in use in fuel cell electric vehicles and airplanes. Understanding the fundamental mechanisms of hydrogen diffusion in polymers could greatly speed up the deployment of hydrogen energy infrastructure at scale. A computational framework that provides atomistic diffusion pathways at experimentally relevant time scales is ideal for this purpose; however, it has yet to be demonstrated. We have developed a novel deep reinforcement learning framework combined with transition state theory to efficiently identify molecular diffusion pathways in polymeric materials. Employing a distributed replay buffer, an ensemble of agents quickly learns the complex energy landscape of the system of interest. Subsequently, the diffusion time of each pathway is estimated using transition state theory. With the distributed training framework, we have achieved significant improvement in learning in terms of both the training metrics and the molecular diffusion time. |
Tian Sang · Ken-ichi Nomura · Aiichiro Nakano · Rajiv Kalia · Priya Vashishta 🔗 |
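As a rough illustration of how a diffusion time can be obtained from transition state theory, the sketch below uses the harmonic (Arrhenius-type) rate k = ν exp(-E_a / (k_B T)) for each hop and sums the mean waiting times 1/k along a pathway. The abstract does not state which form of the theory the authors use, so the rate expression, the attempt frequency of 1e13 Hz, the neglect of backward hops and the barrier values are all assumptions.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def tst_rate(barrier_ev, temperature_k, attempt_freq_hz=1e13):
    """Harmonic TST rate k = nu * exp(-E_a / (k_B * T)), in 1/s.

    attempt_freq_hz is an assumed, typical order-of-magnitude prefactor.
    """
    return attempt_freq_hz * math.exp(-barrier_ev / (K_B_EV * temperature_k))


def pathway_time(barriers_ev, temperature_k, attempt_freq_hz=1e13):
    """Sum of mean waiting times 1/k over the hops of one pathway.

    Simplification: hops are treated as sequential and irreversible.
    """
    return sum(
        1.0 / tst_rate(e, temperature_k, attempt_freq_hz) for e in barriers_ev
    )


# Usage with hypothetical barrier heights (eV) for a three-hop pathway.
barriers = [0.25, 0.40, 0.30]
for temp in (300.0, 400.0):
    print(f"T = {temp:.0f} K: pathway time = {pathway_time(barriers, temp):.3e} s")
```

Because the rate depends exponentially on the barrier, the highest barrier along a pathway dominates its total time, which is why locating low-barrier pathways matters so much.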
-
|
Nonlinear-manifold reduced order models with domain decomposition
(
Poster
)
>
A nonlinear-manifold reduced order model (NM-ROM) is a great way of incorporating underlying physics principles into a neural network-based data-driven approach. We combine NM-ROMs with domain decomposition (DD) for efficient computation. NM-ROMs offer benefits over linear-subspace ROMs (LS-ROMs) but can be costly to train due to parameter scaling with the full-order model (FOM) size. To address this, we employ DD on the FOM, compute subdomain NM-ROMs, and then merge them into a global NM-ROM. This approach has multiple advantages: parallel training of subdomain NM-ROMs, fewer parameters than global NM-ROMs, and adaptability to subdomain-specific FOM features. Each subdomain NM-ROM uses a shallow, sparse autoencoder, enabling hyper-reduction (HR) for improved computational speed. In this paper, we detail an algebraic DD formulation for the FOM, train HR-equipped NM-ROMs for subdomains, and numerically compare them to DD LS-ROMs with HR. Results show an accuracy improvement of about an order of magnitude for the proposed DD NM-ROMs over DD LS-ROMs in solving the 2D steady-state Burgers' equation. |
Alejandro Diaz · Youngsoo Choi · Matthias Heinkenschloss 🔗 |
-
|
A Multimodal Dataset and Benchmark for Radio Galaxy and Infrared Host Detection
(
Poster
)
>
Creating scientific catalogues requires identifying the radio galaxy components and their corresponding infrared hosts. In this paper, we present a novel multimodal dataset developed by expert astronomers to automate the detection and localisation of multi-component extended radio galaxies and their corresponding infrared hosts. The dataset comprises 4,155 instances of galaxies in 2,800 images with both radio and infrared modalities. Each instance contains information on the extended radio galaxy class, its corresponding bounding box that encompasses all of its components, pixel-level segmentation mask, and the position of its corresponding infrared host galaxy. Our dataset is the first publicly accessible dataset that includes images from a highly sensitive radio telescope, infrared satellite, and instance-level annotations for their identification. We benchmark several object detection algorithms on the dataset and propose a novel multimodal fusion approach to identify radio galaxies and the positions of infrared hosts simultaneously. |
Nikhel Gupta 🔗 |
-
|
PINNs-TF2: Fast and User-Friendly Physics-Informed Neural Networks in TensorFlow V2
(
Poster
)
>
Physics-informed neural networks (PINNs) have gained prominence for their capability to tackle supervised learning tasks that conform to physical laws, notably nonlinear partial differential equations (PDEs). This paper presents "PINNs-TF2", a Python package built on the TensorFlow V2 framework. It not only accelerates PINNs implementation but also simplifies user interactions by abstracting complex PDE challenges. We underscore the pivotal role of compilers in PINNs, highlighting their ability to boost performance by up to 119x. Across eight diverse examples, our package, integrated with XLA compilers, demonstrated its flexibility and achieved an average speed-up of 18.12 times over TensorFlow V1. Moreover, a real-world case study is implemented to underscore the compilers' potential to handle many trainable parameters and large batch sizes. For community engagement and future enhancements, our package's source code is openly available at: link. |
Reza Akbarian Bafghi · Maziar Raissi 🔗 |
-
|
Extending Explainable Boosting Machines to Scientific Image Data
(
Poster
)
>
As the deployment of computer vision technology becomes increasingly common in science, the need for explanations of the system output has become a focus of great concern. Driven by the pressing need for interpretable models in science, we propose the use of Explainable Boosting Machines (EBMs) for scientific image data. Inspired by an important application underpinning the development of quantum technologies, we apply EBMs to cold-atom soliton image data tabularized using Gabor Wavelet Transform-based techniques that preserve the spatial structure of the data. In doing so, we demonstrate the use of EBMs for image data for the first time and show that our approach provides explanations that are consistent with human intuition about the data. |
Daniel Schug · Sai Yerramreddy · Rich Caruana · Craig Greenberg · Justyna Zwolak 🔗 |
-
|
Fast Detection of Phase Transitions with Multi-Task Learning-by-Confusion
(
Poster
)
>
Machine learning has been successfully used to study phase transitions. One of the most popular approaches to identifying critical points from data without prior knowledge of the underlying phases is the learning-by-confusion scheme. As input, it requires system samples drawn from a grid of the parameter whose change is associated with potential phase transitions. Up to now, the scheme required training a distinct binary classifier for each possible splitting of the grid into two sides, resulting in a computational cost that scales linearly with the number of grid points. In this work, we propose and showcase an alternative implementation that only requires the training of a single multi-class classifier. Ideally, such multi-task learning eliminates the scaling with respect to the number of grid points. In applications to the Ising model and an image dataset generated with Stable Diffusion, we find significant speedups which, apart from small deviations, correspond to this ideal case. |
Julian Arnold · Frank Schäfer · Niels Lörch 🔗 |
-
|
A Data-Driven, Non-Linear, Parameterized Reduced Order Model of Metal 3D Printing
(
Poster
)
>
Directed energy deposition (DED) is a promising metal additive manufacturing technology capable of 3D printing metal parts with complex geometries at lower cost compared to traditional manufacturing. The technology is most effective when process parameters like laser scan speed and power are optimized for a particular geometry and alloy. To accelerate optimization, we apply a data-driven, parameterized, non-linear reduced-order model (ROM) called Gaussian Process Latent Space Dynamics Identification (GPLaSDI) to physics-based DED simulation data. With an appropriate choice of hyperparameters, GPLaSDI is an effective ROM for this application, with a worst-case error of about 8% and a speed-up of about 1,000,000x with respect to the corresponding physics-based data. |
Aaron Brown · Eric Chin · Youngsoo Choi · Saad Khairallah · Joseph McKeown 🔗 |
-
|
Evaluating Physically Motivated Loss Functions for Photometric Redshift Estimation
(
Poster
)
>
Physical constraints have been suggested to make neural network models more generalizable, more scientifically plausible, and more data-efficient than unconstrained baselines. In this report, we present preliminary work on evaluating the effects of adding soft physical constraints to computer vision neural networks trained to estimate the conditional density of redshift on input galaxy images for the Sloan Digital Sky Survey. We introduce physically motivated soft constraint terms that are not implemented with differential or integral operators. We frame this work as a simple ablation study where the effect of including soft physical constraints is compared to an unconstrained baseline. We compare networks using standard point estimate metrics for photometric redshift estimation, as well as metrics to evaluate how faithfully our conditional density estimate represents the probability over the ensemble of our test dataset. We find no evidence that the implemented soft physical constraints are more effective regularizers than augmentation. |
Andrew Engel · Jan Strube 🔗 |
-
|
Variational quantum dynamics of two-dimensional rotor models
(
Poster
)
>
We present a numerical method to simulate the dynamics of continuous variable quantum many-body systems. Our approach is based on custom neural-network many-body quantum states. We focus on dynamics of two-dimensional quantum rotors and simulate large experimentally-relevant systems using state-of-the-art sampling approaches based on Hamiltonian Monte Carlo. We demonstrate the method can access quantities like the return probability and vorticity oscillations after a quantum quench in two-dimensional systems of up to 64 (8x8) coupled rotors. Our approach can be used to perform previously unexplored non-equilibrium simulations bridging the gap between simulation and experiment. |
Matija Medvidović · Dries Sels 🔗 |
-
|
Pre-training strategy using real particle collision data for event classification in collider physics
(
Poster
)
>
This study aims to improve the performance of event classification in collider physics by introducing a pre-training strategy. Event classification is a typical problem in collider physics, where the goal is to distinguish the signal events of interest from background events as much as possible to search for new phenomena in nature. A pre-training strategy with feasibility to efficiently train the target event classification using a small amount of training data has been proposed. Real particle collision data were used in the pre-training phase as a novelty, where a self-supervised learning technique to handle the unlabeled data was employed. The ability to use real data in the pre-training phase eliminates the need to generate a large amount of training data by simulation and mitigates bias in the choice of physics processes in the training data. Our experiments using CMS open data confirmed that high event classification performance can be achieved by introducing a pre-trained model. This pre-training strategy provides a potential approach to save computational resources for future collider experiments and introduces a foundation model for event classification. |
Tomoe Kishimoto · Masahiro Morinaga · Masahiko Saito · Junichi Tanaka 🔗 |
-
|
Zephyr: Stitching Heterogeneous Training Data with Normalizing Flow for Photometric Redshift Inference
(
Poster
)
>
We present Zephyr, a novel method that integrates cutting-edge normalizing flow techniques into a mixture density estimation framework, enabling effective utilization of the heterogeneous training data for photometric redshift inference. Compared to previous methods, Zephyr demonstrates enhanced robustness for both point estimation and distribution reconstruction by leveraging normalizing flows for density estimation and incorporating careful uncertainty quantification. Moreover, Zephyr offers unique interpretability to disentangle contributions from multi-source training data, which can facilitate future weak lensing analysis by providing an additional quality assessment. As probabilistic generative deep learning techniques gain increasing prominence in astronomy, Zephyr may serve as an inspiration for handling miscellaneous dataset issues, achieving good interpretability, and robustly accounting for uncertainties in heterogeneous training data. |
Zechang Sun · Joshua Speagle · Song Huang · Yuan-Sen Ting · Zheng Cai 🔗 |
-
|
Data-Driven Autoencoder Numerical Solver with Uncertainty Quantification for Fast Physical Simulations
(
Poster
)
>
Traditional partial differential equation (PDE) solvers can be computationally expensive, which motivates the development of faster methods, such as reduced-order-models (ROMs). We present GPLaSDI, a hybrid deep-learning and Bayesian ROM. GPLaSDI trains an autoencoder on full-order-model (FOM) data and simultaneously learns simpler equations governing the latent space. These equations are interpolated with Gaussian Processes, allowing for uncertainty quantification and active learning, even with limited access to the FOM solver. Our framework is able to achieve up to 100,000 times speed-up and less than 7% relative error on fluid mechanics problems. |
Christophe Bonneville · Youngsoo Choi · Debojyoti Ghosh · Jon Belof 🔗 |
-
|
Uncovering Conformal Towers Using Deep Learning
(
Poster
)
>
Extracting the operator spectrum (conformal towers) of critical models with space-time dimensionality larger than 2 is a formidable numerical task, closely related to diagonalizing very large element-wise non-negative matrices. Here we demonstrate the ability of a new ML-based numerical tool (extended RSMI-NE) to tackle such problems. We focus on critical properties of the Ising-Higgs gauge theory in $(2+1)D$ along the self-dual line, which has recently been a subject of debate. We determine, for the first time, the low energy operator content of the associated field-theory. Our approach enables us to largely refute a standing conjecture about the universality class of this transition.
|
Lior Oppenheim · Zohar Ringel · Snir Gazit · Maciej Koch-Janusz 🔗 |
-
|
Incremental learning for physics-informed neural networks
(
Poster
)
>
This work proposes an incremental learning algorithm for physics-informed neural networks (PINNs), which have recently become a powerful tool for solving partial differential equations (PDEs). As demonstrated herein, by developing incremental PINNs (iPINNs) we can effectively mitigate training challenges associated with PINNs loss landscape optimization and learn multiple tasks (equations) sequentially without additional parameters for new tasks. Interestingly, we show that this also improves performance for every equation in the sequence. The approach is based on creating its own subnetwork for each PDE and allowing each subnetwork to overlap with previously learned subnetworks. We also show that iPINNs achieve lower prediction error than regular PINNs for two different scenarios: (1) learning a family of equations (e.g., 1-D convection PDE); and (2) learning PDEs resulting from a combination of processes (e.g., 1-D reaction-diffusion PDE). |
Aleksandr Dekhovich · Marcel Sluiter · David M.J. Tax · Miguel Bessa 🔗 |
-
|
GAMMA: Galactic Attributes of Mass, Metallicity, and Age Dataset
(
Poster
)
>
link
We introduce the GAMMA (Galactic Attributes of Mass, Metallicity, and Age) dataset, a comprehensive collection of galaxy data tailored for Machine Learning applications. This dataset offers detailed 2D maps and 3D cubes of 11 727 galaxies, capturing essential attributes: stellar age, metallicity, and mass. Together with the dataset we publish our code to extract any other stellar or gaseous property from the raw simulation suite to extend the dataset beyond these initial properties, ensuring versatility for various computational tasks. Ideal for feature extraction, clustering, and regression tasks, GAMMA offers a unique lens for exploring galactic structures through computational methods and is a bridge between astrophysical simulations and the field of scientific machine learning (ML). As a first benchmark, we apply Principal Component Analysis (PCA) on this dataset. We find that PCA effectively captures the key morphological features of galaxies with a small number of components. We achieve a dimensionality reduction by a factor of ~200 (~3 650) for 2D images (3D cubes) with a reconstruction accuracy below 5%. |
Tobias Buck · Ufuk Çakır 🔗 |
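The PCA benchmark described in the GAMMA abstract can be illustrated with a minimal sketch. It assumes flattened 2D maps stored as rows of a NumPy array; the synthetic data, the array sizes, and the helper names (`pca_fit`, `pca_compress`, `pca_reconstruct`, `relative_error`) are hypothetical stand-ins and are not taken from the GAMMA code release.

```python
import numpy as np


def pca_fit(data, n_components):
    """Return the mean and the leading principal axes of data (samples x features)."""
    mean = data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_compress(data, mean, axes):
    """Project centered data onto the principal axes."""
    return (data - mean) @ axes.T


def pca_reconstruct(codes, mean, axes):
    """Map compressed codes back to the original feature space."""
    return codes @ axes + mean


def relative_error(original, reconstructed):
    """Frobenius-norm error of the reconstruction relative to the original."""
    return np.linalg.norm(original - reconstructed) / np.linalg.norm(original)


# Synthetic stand-in for flattened galaxy maps (assumption, not GAMMA data):
# 200 "images" of 400 pixels generated from 5 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 400))
images = latent @ mixing + 0.01 * rng.normal(size=(200, 400))

mean, axes = pca_fit(images, n_components=5)
codes = pca_compress(images, mean, axes)
restored = pca_reconstruct(codes, mean, axes)

compression_factor = images.shape[1] / codes.shape[1]
error = relative_error(images, restored)
print(f"compression factor: {compression_factor:.0f}, relative error: {error:.4f}")
```

The compression factor here is simply the ratio of pixels to retained components; how the abstract's factors of ~200 and ~3 650 were computed is not stated in this listing.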
-
|
Differential optimisation for task- and constraint-aware design of particle detectors
(
Poster
)
>
We describe a software package, developed to optimise the geometrical layout and specifications of detectors designed for tomography by scattering of cosmic-ray muons. The software exploits differentiable programming for the modelling of muon interactions with detectors and scanned volumes, the inference of volume properties, and the optimisation cycle performing the loss minimisation. In doing so, we provide the first demonstration of end-to-end-differentiable and inference-aware optimisation of particle physics instruments. We study the performance of the software on a relevant benchmark scenario and discuss its potential applications. |
Giles Strong · Maxime Lagrange · Aitor Orio Alonso · Anna Bordignon · Florian Bury · Tommaso Dorigo · Andrea Giammanco · Mariam Safieldin · Jan Kieseler · Max Lamparth · Pablo Martinez · Federico Nardi · Pietro Vischia · Haitham Zaraket
|
-
|
Neural ODEs as a discovery tool to characterize the structure of the hot galactic wind of M82
(
Poster
)
>
Dynamic astrophysical phenomena are predominantly described by differential equations, yet our understanding of these systems is constrained by our incomplete grasp of non-linear physics and a scarcity of comprehensive datasets. As such, advancing techniques in solving non-linear inverse problems becomes pivotal to addressing numerous outstanding questions in the field. In particular, modeling hot galactic winds is difficult because of unknown structure for various physical terms, and the lack of any kinematic observational data. Additionally, the flow equations contain singularities that lead to numerical instability, making parameter sweeps non-trivial. We leverage differentiable programming, which enables neural networks to be embedded as individual terms within the governing coupled ordinary differential equations (ODEs), and show that this method can adeptly discover hidden physics. We robustly discern the structure of a mass-loading function which captures the physical effects of cloud destruction and entrainment into the hot superwind. Within a supervised learning framework, we formulate our loss function anchored on the astrophysical entropy ($K \propto P/\rho^{5/3}$). Our results demonstrate the efficacy of this approach, even in the absence of kinematic data $v$. We then apply these models to real Chandra X-Ray observations of starburst galaxy M82, providing the first systematic description of mass-loading within the superwind. This work further highlights neural ODEs as a useful discovery tool with mechanistic interpretability in non-linear inverse problems. We make our code public at this GitHub repository.
|
Dustin Nguyen · Yuan-Sen Ting · Todd Thompson · Sebastian Lopez · Laura Lopez 🔗 |
-
|
Speeding up astrochemical reaction networks with autoencoders and neural ODEs
(
Poster
)
>
In astrophysics, solving complex chemical reaction networks is essential but computationally demanding due to the high dimensionality and stiffness of the ODE systems. Traditional approaches for reducing computational load are often specialized to specific chemical networks and require expert knowledge. This paper introduces a machine learning-based solution employing autoencoders for dimensionality reduction and a latent space neural ODE solver to accelerate astrochemical reaction network computations. Additionally, we propose a cost-effective latent space linear function solver as an alternative to neural ODEs. These methods are assessed on a dataset comprising 29 chemical species and 224 reactions. Our findings demonstrate that the neural ODE achieves a 55x speedup over the baseline model while reducing the relative error by up to two orders of magnitude. Furthermore, the linear latent model enhances accuracy and achieves a speedup of up to 4000x compared to standard methods. |
Tobias Buck · Immanuel Felix Sulzer 🔗 |
-
|
GalacticFlow: Learning a Generalized Representation of Galaxies with Normalizing Flows
(
Poster
)
>
State-of-the-art galaxy formation simulations generate data within weeks or months. Their results consist of a random sub-sample of possible galaxies with a fixed number of stars. We propose an ML-based method, GalacticFlow, that generalizes such results. We use normalizing flows to learn the extended distribution function (eDF) of galaxies conditioned on global galactic parameters. GalacticFlow then provides a continuized and condensed representation of the ensemble of galaxies in the data, thus essentially compressing large amounts of explicit simulation data into a small implicit generative model. Our model is able to sample any galaxy eDF given by a set of global parameters and allows generating arbitrarily many stars from it. We show that we can learn such a representation, embodying the entire mass range from dwarf to Milky Way mass, from only 90 galaxies in $\sim18$ hours on a single RTX 2080Ti and generate a new galaxy of one million stars within a few seconds.
|
Tobias Buck · Luca Wolf 🔗 |
-
|
PACuna: Automated Fine-Tuning of Language Models for Particle Accelerators
(
Poster
)
>
Navigating the landscape of particle accelerators has become increasingly challenging with recent surges in contributions. These intricate devices challenge comprehension, even within individual facilities. To address this, we introduce PACuna, a fine-tuned language model refined through publicly available accelerator resources like conferences, pre-prints, and books. We automated data collection and question generation to minimize expert involvement and make the data publicly available. PACuna demonstrates proficiency in addressing intricate accelerator questions, validated by experts. Our approach shows adapting language models to scientific domains by fine-tuning technical texts and auto-generated corpora capturing the latest developments can further produce pre-trained models to answer some intricate questions that commercially available assistants cannot and can serve as intelligent assistants for individual facilities. |
Antonin Sulc · Raimund Kammering · Annika Eichler · Tim Wilksen 🔗 |
-
|
Graph-Theoretical Approaches for AI-Driven Discovery in Quantum Optics
(
Poster
)
>
Emerging findings in the physical sciences frequently present new avenues for AI applications that can enhance its efficiency or broaden its scope, as we demonstrated in our study on quantum optics. We present a method that represents quantum optics experiments as abstract weighted graphs, converting problems that encompass both continuous and discrete elements into purely continuous optimization tasks. This allows efficient use of both gradient-based and neural network methods, circumventing the need for workarounds due to the discrete nature of the problems. The new representation not only simplifies the design process but also facilitates a deeper understanding and interpretation of strategies derived from neural networks. |
Xuemei Gu · Carlos Ruiz-Gonzalez · Sören Arlt · Tareq Jaouni · Jan Petermann · Sharareh Sayyad · Ebrahim Karimi · Nora Tischler · Mario Krenn 🔗 |
-
|
Direct Amortized Likelihood Ratio Estimation
(
Poster
)
>
We introduce a new amortized likelihood ratio estimator for likelihood-free simulation-based inference (SBI). Our estimator is simple to train and estimates the likelihood ratio using a single forward pass of the neural estimator. Our approach directly computes the likelihood ratio between two competing parameter sets which is different from the previous approach of comparing two neural network output values. We refer to our model as the direct neural ratio estimator (DNRE). As part of introducing the DNRE, we derive a corresponding Monte Carlo estimate of the posterior. We benchmark our new ratio estimator and compare to previous ratio estimators in the literature. We show that our new ratio estimator often outperforms these previous approaches. As a further contribution, we introduce a new derivative estimator for likelihood ratio estimators that enables us to compare likelihood-free Hamiltonian Monte Carlo (HMC) with random-walk Metropolis-Hastings (MH). We show that HMC is equally competitive, which has not been previously shown. Finally, we include a novel real-world application of SBI by using our neural ratio estimator to design a quadcopter. |
Adam Cobb · Brian Matejek · Daniel Elenius · Anirban Roy · Susmit Jha 🔗 |
-
|
Physics-informed neural networks with unknown measurement noise
(
Poster
)
>
Physics-informed neural networks (PINNs) constitute a flexible approach to both finding solutions and identifying parameters of partial differential equations. Most works on the topic assume noiseless data, or data contaminated by weak Gaussian noise. We show that the standard PINN framework breaks down in case of non-Gaussian noise. We give a way of resolving this fundamental issue and we propose to jointly train an energy-based model (EBM) to learn the correct noise distribution. We illustrate the improved performance of our approach using multiple examples. |
Philipp Pilar · Niklas Wahlström 🔗 |
-
|
Universal Semantic-less Texture Boundary Detection for Microscopy (and Metallography)
(
Poster
)
>
The automated analysis of textures has always been a topic of importance in metallographic imaging in particular and in microscopy in general. Those analyzed textures are used in a variety of applications, and as such, texture analysis is at the backbone of most, if not all, other vision tasks. However, the task of texture analysis greatly differs from related and well-defined tasks such as edge, contour, and semantic analysis for detection and segmentation. As texture perception is hard even for humans to define and includes a subjective outlook, computerized texture-based segmentation in semantic-less and texture-oriented images has not been achieved so far. Moreover, it is difficult to apply recent computer vision algorithms to such images because of database shortages in this domain, as well as a shortage of accurately labeled data. Therefore, we wish to develop a Universal Texture Representation (UTR). This representation will allow us to segment any texture image in any domain and even develop a Universal Texture Boundary Detector (UTBD). Crucially, such algorithms should work on texture images (and even videos) that have no semantic meaning, such as metallographic textures; hence, the vast literature on edge, contour, and semantic segmentation cannot be used as is in our context. Henceforth, we formulate and define our problem: Universal semantic-less texture boundary detection. A solution to this newly defined problem - which, in this work, we present the initial path towards on our Texture Boundary in Metallography (TBM) dataset - could be used in a variety of applications as is or as an enhancer to other closely related vision tasks. For example, it could help quickly segment new images based on single-click segmentation cues provided by the user, or it could help retrieve images with similar textures from past experiments. |
Matan Rusanovsky · Ofer Beeri · Shai Avidan · Gal Oren 🔗 |
-
|
Information bottleneck learns dominant transfer operator eigenfunctions in dynamical systems
(
Poster
)
>
A common task across the physical sciences is that of model reduction: given a high-dimensional and complex description of a full system, how does one reduce it to a small number of important collective variables? Here we investigate model reduction for dynamical systems using the information bottleneck framework. We show that the optimal compression of a system's state is achieved by encoding spectral properties of its transfer operator. After demonstrating this in analytically-tractable examples, we show our findings hold also in variational compression schemes using experimental fluids data. These results shed light on the latent variables in certain neural network architectures, and show the practical utility of information-based loss functions. |
Matthew S. Schmitt · Maciej Koch-Janusz · Michel Fruchart · Daniel Seara · Vincenzo Vitelli 🔗 |
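As a rough illustration of what "dominant transfer operator eigenfunctions" means in the abstract above, the sketch below computes the slowest-decaying mode of a small Markov chain. It does not implement the information bottleneck itself. The 4-state transition matrix is a hypothetical toy system chosen to be symmetric (so the stationary distribution is uniform and `numpy.linalg.eigh` applies), and `dominant_mode` is an illustrative helper name.

```python
import numpy as np

# Hypothetical toy system (assumption): two metastable groups, {0, 1} and {2, 3},
# with fast mixing inside each group and rare hops between groups.
transition = np.array([
    [0.90, 0.09, 0.01, 0.00],
    [0.09, 0.90, 0.00, 0.01],
    [0.01, 0.00, 0.90, 0.09],
    [0.00, 0.01, 0.09, 0.90],
])


def dominant_mode(matrix):
    """Return the second-largest eigenvalue of a symmetric matrix and its eigenvector."""
    # eigh returns eigenvalues in ascending order; the largest (1.0) belongs to the
    # constant eigenvector, so the slowest non-trivial mode sits at index -2.
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return eigenvalues[-2], eigenvectors[:, -2]


slow_eigenvalue, slow_mode = dominant_mode(transition)

# A one-bit compression of the state: the sign of the slow mode tells
# which metastable group a state belongs to.
labels = (slow_mode > 0).astype(int)
print(slow_eigenvalue, labels)
```

In this toy case the sign of the slow eigenvector recovers the two metastable groups, which is the kind of collective variable a good compression of the state would be expected to retain. Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector.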
-
|
Causa prima: cosmology meets causal discovery for the first time
(
Poster
)
>
In astrophysics, controlled experiments are typically impossible, and it is then necessary to make the most of observational data. Other disciplines that are in a similar predicament (from epidemiology to economics) increasingly leverage causal inference methods. This is however not yet the case in astrophysics. In this contribution, we apply causal discovery for the first time to an important open problem in astrophysics, namely the possible coevolution of supermassive black holes (SMBHs) and their host galaxies. We make use of a comprehensive catalog of observed galaxy properties, on which we apply the Peter-Clark (PC) algorithm to obtain a single completed partially directed acyclic graph (CPDAG), representing a Markov equivalence class over directed acyclic graphs (DAGs). We test the robustness of our analysis by randomly subsampling our dataset and showing that we recover similar results. We suggest a physical explanation for the causal structure that we learned in terms of the hierarchical assembly pathway of SMBHs. |
Mario Pasquato · Zehao Jin · Pablo Lemos · Benjamin Davis · Andrea Macciò 🔗 |
-
|
Unraveling the Mysteries of Galaxy Clusters: Recurrent Inference Deconvolution of X-ray Spectra
(
Poster
)
>
In the realm of X-ray spectral analysis, the true nature of spectra has remained elusive, as observed spectra have long been the outcome of convolution between instrumental response functions and intrinsic spectra. In this study, we employ a recurrent neural network framework, the Recurrent Inference Machine (RIM), to achieve the unprecedented deconvolution of intrinsic spectra from instrumental response functions. Our RIM model is meticulously trained on cutting-edge thermodynamic models and authentic response matrices sourced from the Chandra X-ray Observatory archive. Demonstrating remarkable accuracy, our model successfully reconstructs intrinsic spectra well below the 1-$\sigma$ error level. We showcase the practical application of this novel approach through real Chandra observations of the galaxy cluster Abell 1550, a vital calibration target for the recently launched X-ray telescope, XRISM. This pioneering work marks a significant stride in the domain of X-ray spectral analysis, offering a promising avenue for unlocking hitherto concealed insights into spectra.
|
Carter Rhea · Julie Hlavacek-Larrondo · Ralph Kraft · Akos Bogdan · Laurence Perreault-Levasseur · Alexandre Adam · John Zuhone 🔗 |
-
|
Scalable physics-guided data-driven component model reduction for Stokes flow
(
Poster
)
>
Stokes flow in repeated unit cell structures is extensively studied in applications for many natural and engineering processes. However, for a large number of cells, resolving all scales can be prohibitively expensive using traditional numerical methods. To make the problem tractable, these methods often rely on volume-averaged approximation, resulting in accuracy issues. To address this, we propose a novel data-driven component model reduction approach that is constrained by the first-principle physics equation. This method employs reduced order modeling (ROM) to identify crucial physics modes in small-scale unit components and projects them onto the governing physics equation, creating a reduced model with essential physics details. We incorporate discontinuous Galerkin domain decomposition (DG-DD), enabling large-scale system construction without data at such vast scales. Applying this approach to the incompressible Stokes flow equation, we achieve nearly 100 times faster solutions with a relative error of ~1%, even at scales 1000 times larger than the original problem. |
Kevin Chung · Youngsoo Choi · Pratanu Roy · Thomas Moore · Thomas Roy · Tiras Y. Lin · Sarah Baker 🔗 |
-
|
Improving dispersive readout of a superconducting qubit by machine learning on path signature
(
Poster
)
>
One major challenge that arises from quantum computing is to implement fast, high-accuracy quantum state readout. For superconducting circuits, this problem reduces to a time series classification problem on readout signals. We propose that using path signature methods to extract features can enhance existing techniques for quantum state discrimination. We demonstrate the superior performance of our proposed approach over conventional methods in distinguishing three different quantum states on real experimental data from a superconducting transmon qubit. |
Shuxiang Cao · Zhen Shao · Jian-Qing Zheng · Mustafa Bakr · Peter Leek · Terry Lyons 🔗 |
-
|
Optimizing Likelihood-free Inference using Self-supervised Neural Symmetry Embeddings
(
Poster
)
>
Likelihood-free inference is quickly emerging as a powerful tool to perform fast and effective parameter estimation. We demonstrate a technique of optimizing likelihood-free inference to make it even faster by marginalizing symmetries in a physical problem. In this approach, physical symmetries, for example time translation, are learned using joint embedding via self-supervised learning with symmetry data augmentations. Subsequently, parameter inference is performed using a normalizing flow where the embedding network is used to summarize the data before conditioning the parameters. We present this approach on two simple physical problems and we show faster convergence with a smaller number of parameters compared to a normalizing flow that does not use a pre-trained symmetry-informed representation. |
Deep Chatterjee · Philip Harris · Maanas Goel · Malina Desai · Michael Coughlin · Erik Katsavounidis 🔗 |
-
|
Removing Dust from CMB Observations with Diffusion Models
(
Poster
)
>
In cosmology, the quest for primordial $B$-modes in cosmic microwave background (CMB) observations has highlighted the critical need for a refined model of the Galactic dust foreground. We investigate diffusion-based modeling of the dust foreground and their interest for component separation. Under the assumption of a Gaussian CMB with known cosmology (or covariance matrix), we show that diffusion models can be trained on examples of dust emission maps such that their sampling process directly coincides with posterior sampling in the context of component separation. We illustrate this on simulated mixtures of dust emission and CMB. We show that common summary statistics (power spectrum, Minkowski functionals) of the components are well recovered by this process. We also introduce a model conditioned by the CMB cosmology that outperforms models trained using a single cosmology on component separation. Such a model will be used in future work for diffusion-based cosmological inference.
|
David Heurtel-Depeiges · Blakesly Burkhart · Ruben Ohana · Bruno Régaldo-Saint Blancard 🔗 |
-
|
$\rho$-Diffusion: A diffusion-based density estimation framework for computational physics
(
Poster
)
>
In physics, density $\rho(\cdot)$ is a fundamentally important scalar function to model, since it describes a scalar field or a probability density function that governs a physical process. Modeling $\rho(\cdot)$ typically scales poorly with parameter space, however, and quickly becomes prohibitively difficult and computationally expensive. One promising avenue to bypass this is to leverage the capabilities of denoising diffusion models often used in high-fidelity image generation to parameterize $\rho(\cdot)$ from existing scientific data, from which new samples can be trivially drawn. In this paper, we propose $\rho$-Diffusion, an implementation of denoising diffusion probabilistic models for multidimensional density estimation in physics, which is currently in active development and, from our results, performs well on physically motivated 2D and 3D density functions. Moreover, we propose a novel hashing technique that allows $\rho$-Diffusion to be conditioned by arbitrary amounts of physical parameters of interest.
|
Maxwell Xu Cai · Kin Long Kelvin Lee 🔗 |
-
|
Transformers for Scattering Amplitudes
(
Poster
)
>
We pursue the use of Transformers to extend state-of-the-art results in theoretical particle physics. Specifically, we use Transformers to predict the integer coefficients of large mathematical expressions that represent scattering amplitudes in planar N=4 Yang-Mills theory, which is a quantum field theory closely related to the theory that describes Higgs boson production at the Large Hadron Collider. We first formulate the physics problem in a language-based representation that is amenable to Transformer architectures and standard training objectives. Then we show that the model can achieve high accuracy (>98%) on two tasks. |
Garrett Merz · Francois Charton · Tianji Cai · Kyle Cranmer · Lance Dixon · Niklas Nolte · Matthias Wilhelm 🔗 |
-
|
NeuralHMC: Accelerated Hamiltonian Monte Carlo with a Neural Network Surrogate Likelihood
(
Poster
)
>
Bayesian Inference with Markov Chain Monte Carlo requires the ability to efficiently compute the likelihood function. In some scientific applications, the likelihood can only be computed by a numerical PDE solver, which can be prohibitively expensive. We demonstrate that some such problems can be made tractable by amortizing the computation with a surrogate likelihood function implemented by a neural network. This can have the added benefits of reducing noise in the likelihood evaluations and providing fast gradient calculations. We demonstrate these advantages in a model of heliospheric transport of galactic cosmic rays, where our approach enables us to estimate the posterior of five latent parameters of the Parker equation. |
Linnea Wolniewicz · Peter Sadowski · Claudio Corti 🔗 |
-
|
Discovering Galaxy Features via Dataset Distillation
(
Poster
)
>
In many applications, Neural Nets (NNs) have classification performance on par with or even exceeding human capacity. Moreover, it is likely that NNs leverage underlying features that might differ from those humans perceive to classify. Can we |
Haowen Guan · Xuan Zhao · Zishi Wang · Zhiyang Li · Julia Kempe 🔗 |
-
|
Modeling Coupled 1D PDEs of Cardiovascular Flow with Spatial Neural ODEs
(
Poster
)
>
Tackling coupled sets of partial differential equations (PDEs) through scientific machine learning presents a complex challenge, but it is essential for developing data-driven physics-based models. We employ a novel approach to model the coupled PDEs that govern the blood flow in stenosed arteries with deformable walls, while incorporating realistic inlet flow waveforms. We propose a low-dimensional model based on neural ordinary differential equations (ODEs) inspired by 1D blood flow equations. Our unique approach formulates the problem as ODEs in space rather than time, effectively overcoming issues related to time-dependent boundary conditions and PDE coupling. This innovative framework accurately captures flow rate and area variations, even when extrapolating to unseen waveforms. The promising results from this approach offer a different perspective on deploying neural ODEs to model coupled PDEs with unsteady boundary conditions, which are prevalent in many engineering applications. |
Hunor Csala · Arvind Mohan · Daniel Livescu · Amirhossein Arzani 🔗 |
-
|
Generating Multiphase Fluid Configurations in Fractures using Diffusion Models
(
Poster
)
>
Pore-scale simulations accurately describe transport properties of fluids in the subsurface. These simulations enhance our understanding of applications such as assessing hydrogen storage efficiency and forecasting CO$_2$ sequestration processes in underground reservoirs. Nevertheless, these simulations are computationally expensive due to their mesoscopic nature. In addition, their stationary solutions are not guaranteed to be unique, so multiple runs with different initial conditions must be performed to ensure sufficient sample coverage. These factors complicate the task of obtaining a representative and reliable forecast from pore-scale simulations. To address the high computational cost, we propose a hybrid method that couples generative diffusion models and physics-based modeling. Upon training a generative model, we synthesize samples that serve as the initial conditions for physics-based simulations. We measure the relaxation time (to stationary solutions) of the simulations, which serves as a validation metric and early-stopping criterion. Our numerical experiments revealed that the hybrid method exhibits a speed-up of up to 8.2 times compared to commonly used initialization methods. This finding offers compelling initial support that the proposed diffusion model-based hybrid scheme has the potential to significantly decrease the time required for convergence of numerical simulations without compromising the physical robustness.
|
Jaehong Chung · Agnese Marcato · Eric Guiltinan · Tapan Mukerji · Yen Ting Lin · Javier E. Santos 🔗 |
-
|
Redefining Super-Resolution: Fine-mesh PDE predictions without classical simulations
(
Poster
)
>
In Computational Fluid Dynamics (CFD), coarse mesh simulations offer computational efficiency but often lack precision. Applying conventional super-resolution to these simulations poses a significant challenge due to the fundamental contrast between downsampling high-resolution images and authentically emulating low-resolution physics. The former method conserves more of the underlying physics, surpassing the usual constraints of real-world scenarios. We propose a novel definition of super-resolution tailored for PDE-based problems. Instead of simply downsampling from a high-resolution dataset, we use coarse-grid simulated data as our input and predict fine-grid simulated outcomes. Employing a physics-infused UNet upscaling method, we demonstrate its efficacy across various 2D-CFD problems such as discontinuity detection in Burgers' equation, methane combustion, and fouling in industrial heat exchangers. Our method enables the generation of fine-mesh solutions bypassing traditional simulation, ensuring considerable computational savings and fidelity to the original ground truth outcomes. Through diverse boundary conditions during training, we further establish the robustness of our method, paving the way for its broad application in engineering and scientific CFD solvers. |
Rajat Sarkar · Ritam Majumdar · Vishal Jadhav · Sagar Srinivas Sakhinana · Venkataramana Runkana 🔗 |
-
|
Discovering Quantum Error Correcting Codes with Deep Reinforcement Learning
(
Poster
)
>
Reinforcement Learning (RL) stands out in finding optimal strategies for complex tasks without prior knowledge of the dynamics of the system. In this work, we use RL for automatically discovering Quantum Error Correction (QEC) codes and their encoding circuits for a given gate set and error model. Our approach is very general and our implementation is extremely efficient. In particular, we find all possible stabilizer codes and their encoding circuits that correct single errors for up to 15 qubits, for an illustrative gate set. All in all, our framework is a versatile tool for QEC across diverse quantum hardware platforms of interest. |
Jan Olle · Remmy Zen · Matteo Puviani · Florian Marquardt 🔗 |
-
|
Discovering Quantum Circuits for Logical State Preparation with Deep Reinforcement Learning
(
Poster
)
>
Quantum error correction (QEC) is important for the realization of fault-tolerant quantum computers. The first essential step of QEC is to encode the logical state into physical qubits. However, there is no unique recipe for finding a quantum circuit that encodes or prepares the logical state, especially for a given gate set and qubit connectivity. In this work, we use deep reinforcement learning to automatically discover quantum circuits to prepare the logical state of a QEC code given a gate set and qubit connectivity. We show that our method can prepare logical states of codes with up to $17$ physical qubits on fully connected qubits, and of codes with up to $9$ physical qubits using the gate set and connectivity of IBM quantum devices, with smaller circuit sizes than other methods.
|
Remmy Zen · Jan Olle · Matteo Puviani · Florian Marquardt 🔗 |
-
|
Differentiable Simulation of a Liquid Argon TPC for High-Dimensional Calibration
(
Poster
)
>
Liquid argon time projection chambers (LArTPCs) are widely used in particle detection for their tracking and calorimetric capabilities. The particle physics community actively builds and improves high-quality simulators for such detectors in order to develop physics analyses in a realistic setting. The fidelity of these simulators relative to real, measured data is limited by the modeling of the physical detectors used for data collection. This modeling can be improved by performing dedicated calibration measurements. Conventional approaches calibrate individual detector parameters or processes one at a time. However, the impact of detector processes is entangled, making this a poor description of the underlying physics. We introduce a differentiable simulator that enables a gradient-based optimization, allowing for the first time a simultaneous calibration of all detector parameters. We describe the procedure of making a differentiable simulator, highlighting the challenges of retaining the physics quality of the standard, non-differentiable version while providing meaningful gradient information. We further discuss the advantages and drawbacks of using our differentiable simulator for calibration. Finally, we discuss extensions to our approach, including applications of the differentiable simulator to physics analysis pipelines. |
Pierre Granger 🔗 |
-
|
Learning Hard Distributions with Quantum-enhanced Variational Autoencoders
(
Poster
)
>
Generative learning is an important task in classical machine learning, with several popular models including generative adversarial networks (GANs) and variational autoencoders (VAEs). In quantum machine learning, an important task in this setting is that of modeling the distributions obtained by measuring quantum mechanical systems. Classical generative algorithms, including GANs and VAEs, can model the distributions of product states with high fidelity, but fail or require an exponential number of parameters to model entangled states. In this paper, we introduce a quantum-enhanced VAE (QeVAE), a generative quantum-classical hybrid model that uses quantum correlations to improve the fidelity over classical VAEs, while requiring only a linear number of parameters. We provide a closed-form expression for the output distributions of the QeVAE. We also empirically show that the QeVAE outperforms classical models on several classes of quantum states, such as 4-qubit and 8-qubit quantum circuit states, Haar random states, and quantum kicked rotor states, with a more than 2x increase in fidelity for some states. Finally, we find that the trained model outperforms the classical model when executed on the IBMq Manila quantum computer. As an application, we show that our techniques can also be used for the task of circuit compilation. Our work paves the way for new applications of quantum generative learning algorithms and characterizing measurement distributions of high-dimensional quantum states. |
Anantha Rao · Dhiraj Madan · Anupama Ray · Dhinakaran Vinayagamurthy · M S Santhanam 🔗 |
-
|
Revealing the Mechanism of Large-scale Gradient Systems Using a Neural Reduced Potential
(
Poster
)
>
Constructing the reduction model of large-scale pattern dynamics is challenging. In this study, a framework is proposed to estimate a reduction model of the gradient system, often observed in various pattern dynamics, in a data-driven manner using a deep learning model inspired by Hamiltonian neural networks for video. Furthermore, the proposed framework verifies whether the reduction model is consistent with the phenomenon and contains useful properties. To demonstrate its usefulness, it is applied to the numerical calculation data of magnetic domain pattern formation. Consequently, the previous reduction model proposed for magnetic domain pattern dynamics is found to be insufficient to explain the phenomenon, and suggestions for possible directions for the improvement of the reduction model are provided. |
Shunya Tsuji · Ryo Murakami · Hayaru Shouno · Yoh-ichi Mototake 🔗 |
-
|
Physical Symbolic Optimization
(
Poster
)
>
We present a framework for constraining the automatic sequential generation of equations to obey the rules of dimensional analysis by construction. Combining this approach with reinforcement learning, we built $\Phi$-SO, a Physical Symbolic Optimization method for recovering analytical functions from physical data leveraging units constraints. Our symbolic regression algorithm achieves state-of-the-art results in contexts in which variables and constants have known physical units, outperforming all other methods on SRBench's Feynman benchmark in the presence of noise (exceeding 0.1%) and showing resilience even in the presence of significant (10%) levels of noise.
|
Wassim Tenachi · Rodrigo Ibata · Foivos Diakogiannis 🔗 |
-
|
Score-based Data Assimilation for a Two-Layer Quasi-Geostrophic Model
(
Poster
)
>
Data assimilation addresses the problem of identifying plausible state trajectories of dynamical systems given noisy or incomplete observations. In geosciences, it presents challenges due to the high dimensionality of geophysical dynamical systems, often exceeding millions of dimensions. This work assesses the scalability of score-based data assimilation (SDA), a novel data assimilation method, in the context of such systems. We propose modifications to the score network architecture aimed at significantly reducing memory consumption and execution time. We demonstrate promising results for a two-layer quasi-geostrophic model. |
François Rozet · Gilles Louppe 🔗 |
-
|
Physics-Informed Tensor Basis Neural Network for Turbulence Closure Modeling
(
Poster
)
>
Despite the increasing availability of high-performance computational resources, Reynolds-Averaged Navier-Stokes (RANS) simulations remain the workhorse for the analysis of turbulent flows in real-world applications. Linear eddy viscosity models (LEVM), the most commonly employed model type, cannot accurately predict complex states of turbulence. This work combines a deep-neural-network-based, nonlinear eddy viscosity model with turbulence realizability constraints as an inductive bias in order to yield improved predictions of the anisotropy tensor. Using visualizations based on the barycentric map, we show that the proposed machine learning method's anisotropy tensor predictions offer a significant improvement over all LEVMs in traditionally challenging cases with surface curvature and flow separation. However, this improved anisotropy tensor does not, in general, yield improved mean-velocity and pressure field predictions in comparison with the best-performing LEVM. |
Leon Riccius · Atul Agrawal · Phaedon S Koutsourelakis 🔗 |
-
|
Relating Generalization in Deep Neural Networks to Sensitivity of Discrete Dynamical Systems
(
Poster
)
>
The ability of deep neural networks to generalize over a large diversity of function modeling tasks remains a key mystery of the field. While the workings of the network are fully known, it remains unclear which specific properties are necessary and/or sufficient for the observed generalization. In this paper, we approach the characterization of this generalization by studying the ability to learn the evolution of discrete dynamical systems. Our findings reveal a strong correlation between the number of examples needed for generalization and the sensitivity of the dynamical systems to perturbations of the initial state. |
Jan Disselhoff · Michael Wand 🔗 |
-
|
Orbital-Free Density Functional Theory with Continuous Normalizing Flows
(
Poster
)
>
Orbital-free density functional theory (OF-DFT) provides an alternative approach for calculating the molecular electronic energy, relying solely on the electron density. In OF-DFT, the ground-state density is optimized variationally to minimize the total energy functional while satisfying the normalization constraint. In this work, we introduce a novel approach by parameterizing the electronic density with a normalizing flow ansatz, which is also optimized by minimizing the total energy functional. Our model successfully replicates the electronic density for a diverse range of chemical systems, including a one-dimensional diatomic molecule, specifically lithium hydride with varying interatomic distances, as well as comprehensive simulations of hydrogen and water molecules, all conducted in Cartesian space. |
Rodrigo Vargas-Hernandez · Ricky T. Q. Chen · Alexandre de Camargo 🔗 |
-
|
DeepTreeGANv2: Iterative Pooling of Point Clouds
(
Poster
)
>
In High Energy Physics, detailed and time-consuming simulations are used for particle interactions with detectors. To bypass these simulations with a generative model, the generation of large point clouds in a short time is required, while the complex dependencies between the particles must be correctly modelled. Particle showers are inherently tree-based processes, as each particle is produced by the decay or detector interaction of a particle of the previous generation. In this work, we present a significant extension to DeepTreeGAN, featuring a critic, that is able to aggregate such point clouds iteratively in a tree-based manner. We show that this model can reproduce complex distributions, and we evaluate its performance on the public JetNet 150 dataset. |
Moritz A.W. Scham · Dirk Krücker · Kerstin Borras 🔗 |
-
|
Robust Ocean Subgrid-Scale Parameterizations Using Fourier Neural Operators
(
Poster
)
>
In climate simulations, small-scale processes shape ocean dynamics but remain computationally expensive to resolve directly. For this reason, their contributions are commonly approximated using empirical parameterizations, which lead to significant errors in long-term projections. In this work, we develop parameterizations based on Fourier Neural Operators, showcasing their accuracy and generalizability in comparison to other approaches. Finally, we discuss the potential and limitations of neural networks operating in the frequency domain, paving the way for future investigation. |
Victor Mangeleer · Gilles Louppe 🔗 |
-
|
3D Localization of Microparticles from Holographic Images using Neural Networks
(
Poster
)
>
Digital in-line holography is a versatile and reliable imaging technique for characterizing the size and spatial distribution of particles in flows. It maps 3D particle snapshots into a 2D image -- a hologram. The analysis bottleneck of in-line holographic imaging is the immense cost of computation and manual effort that must be invested. Here we propose a learning-based approach to extract the size and 3D spatial distribution of objects from holograms. We outperform the standard propagation-based method in terms of detection rate, precision and processing time on a small sensor aperture and at $1/4$ of the original resolution. Our method performs with an F1 score of 0.91 in suspensions with more than 60 particles/cm$^3$, which is an $18$ percent performance boost in comparison to propagation-based software commonly used in practice. At the same time, our method is also several orders of magnitude faster, eliminating the computational bottleneck.
|
Ayush Paliwal · Oliver Schlenczek · Birte Thiede · Gholamhossein Bagheri · Alexander Ecker 🔗 |
-
|
Locating Hidden Exoplanets Using Machine Learning
(
Poster
)
>
Exoplanets in protoplanetary disks cause localized deviations from Keplerian velocity in molecular line emission. Current methods of characterizing these deviations are slow and prone to false negatives. We demonstrate that machine learning can quickly and accurately detect planets. We train computer vision models on synthetic images of protoplanetary disks generated from simulations and apply these models to real observations. The models recreate previous discoveries, accurately locating known planets. A new exoplanet in the disk HD 142666 is identified. |
Jason Terry · Sergei Gleyzer 🔗 |
-
|
Learning Optical Map in Liquid Xenon Detector with Poisson Likelihood Loss
(
Poster
)
>
Dual-phase liquid xenon time projection chambers (LXeTPC) have been successfully applied in rare event searches in astroparticle physics because of their ability to reach low backgrounds and detect small scintillation signals with photosensors. Accurate modeling of optical properties is essential for reconstructing particle interactions within these detectors as well as for developing data selection criteria. This is commonly achieved with discretized maps derived from Monte Carlo simulation or approximated with empirical analytical models. In this work, we take a novel approach, using a neural network trained with a Poisson log-likelihood ratio loss to model the mapping from light source location to the expected light intensity for each photosensor. We demonstrate its effectiveness by integrating it into a likelihood fitter for position reconstruction, simultaneously providing insights into the uncertainty associated with the reconstructed position. |
Shixiao Liang · Christopher Tunnell 🔗 |
-
|
AstroYOLO: Learning Astronomy Multi-Tasks in a Single Unified Real-Time Framework
(
Poster
)
>
In this paper, we proposed a single unified real-time pipeline that jointly performs two tasks: star vs. galaxy detection and smooth type vs. disk type galaxy classification. To achieve this goal, we introduced a model which has two different classification heads sharing the same backbone: the first classification head is used to detect useful objects in the universe images and classify them into star vs. galaxy, while the second classification head is used to further classify whether the galaxy is smooth or disk type. As the backbone, we used the YOLOX architecture, added two classification heads on top of it, and trained them using two heterogeneous datasets: the star vs. galaxy detection dataset, which has images including star and galaxy objects with corresponding bounding boxes and class labels, and the smooth vs. disk type classification dataset, which has galaxy images and their corresponding labels. To prevent catastrophic forgetting when learning two heads and a backbone, we alternated training between the two tasks and also applied data augmentation such as mosaic and mix-up methods. The final model achieved 73.4% accuracy on the smooth vs. disk type classification task and a 65.6 mAP score on the star vs. galaxy detection task. |
Nodirkhuja Khujaev · Roman Tsoy · Seungryul Baek 🔗 |
-
|
Coarse graining systems on inhomogeneous graphs using contrastive learning
(
Poster
)
>
Understanding and characterizing the emergent behavior of systems with numerous interacting components is typically difficult. This is especially the case when these interactions occur on an inhomogeneous graph, a situation relevant to many systems in bio- and statistical physics. Here we showcase a data-driven approach, aimed at optimally compressing the system's information based on an information-theoretic principle. We develop an efficient numerical algorithm applicable to systems on arbitrary static graphs which employs variational estimators of mutual information to find optimal compression. We demonstrate that the optimal compression maps interpretably extract physically relevant local degrees of freedom. This enables us to construct an effective theory of a strongly correlated system on a quasicrystal. |
Doruk Efe Gökmen · Maciej Koch-Janusz · Zohar Ringel · Sebastian Huber · Felix Flicker · Sounak Biswas 🔗 |
-
|
Understanding Pathologies of Deep Heteroskedastic Regression
(
Poster
)
>
Recent studies have reported negative results when using heteroskedastic neural regression models. In particular, for overparameterized models, the mean and variance networks are powerful enough to either fit every single data point, or to learn a constant prediction with an output variance exactly matching every predicted residual, explaining the targets as pure noise. We study these difficulties from the perspective of statistical physics and show that the observed instabilities are not specific to any neural network architecture but are already present in a field theory of an overparameterized conditional Gaussian likelihood model. Under light assumptions, we derive a nonparametric free energy that can be solved numerically. The resulting solutions show excellent qualitative agreement with empirical model fits on real-world data and, in particular, prove the existence of phase transitions, i.e., abrupt, qualitative differences in the behaviors of the regressors upon varying the regularization strengths on the two networks. Our work provides a theoretical explanation for the necessity to carefully regularize heteroskedastic regression models. Moreover, the insights from our theory suggest a scheme for optimizing this regularization which is quadratically more efficient than the naive approach. |
Eliot Wong-Toi · Alex Boyd · Vincent Fortuin · Stephan Mandt 🔗 |
-
|
Advancing Generative Modelling of Calorimeter Showers on Three Frontiers
(
Poster
)
>
Generative machine learning can be used to augment and speed up traditional physics simulations, i.e. the simulation of elementary particles in the detector of collider experiments. Like many physics data, these calorimeter showers can either be represented as images or as permutation-invariant lists of measurements, i.e. as point clouds. We advance the generative models for calorimeter showers on three frontiers: (1) increasing the number of conditional features for precise energy- and angle-wise generation with the bounded information bottleneck auto-encoder (BIB-AE), (2) improving generation fidelity using a normalizing flow model, dubbed "Layer-to-Layer-Flows" (L2LFlows), (3) developing a diffusion model for geometry-independent calorimeter point clouds scalable to O(1000) points, called CaloClouds, and distilling it into a consistency model for fast single-shot sampling. |
Erik Buhmann · Sascha Diefenbacher · Engin Eren · Frank Gaede · Gregor Kasieczka · William Korcari · Anatolii Korol · Claudius Krause · Katja Krueger · Peter McKeown · Imahn Shekhzadeh · David Shih
|
-
|
Multi-fidelity Constrained Optimization for Stochastic Black Box Simulators
(
Poster
)
>
Constrained optimization of the parameters of a simulator plays a crucial role in a design process. These problems become challenging when the simulator is stochastic and computationally expensive, and the parameter space is high-dimensional. One can efficiently perform optimization only by utilizing the gradient with respect to the parameters, but these gradients are unavailable in many legacy, black-box codes. We introduce the algorithm Scout-Nd (Stochastic Constrained Optimization for N dimensions) to tackle the issues mentioned earlier by efficiently estimating the gradient, reducing the noise of the gradient estimator, and applying multi-fidelity schemes to further reduce computational effort. We validate our approach on standard benchmarks, demonstrating its effectiveness in optimizing parameters and its better performance compared to existing methods. |
Kislaya Ravi · Atul Agrawal · Phaedon S Koutsourelakis · Hans-Joachim Bungartz 🔗 |
-
|
Activation Functions in Non-Negative Neural Networks
(
Poster
)
>
Optical neural networks (ONNs) have the potential to overcome scaling limitations of transistor-based systems due to their inherent low latency and large available bandwidth. However, encoding the information directly in the physical properties of light fields also imposes new computational constraints, for example the restriction to only positive intensity values for incoherent photonic processors. In this work, we address design and training challenges of physically constrained information processing with a particular focus on activation functions in non-negative neural networks (4Ns). Building on biological inspirations, we revisit the concept of inhibitory (decreasing) and excitatory (increasing) activation functions, explore their effects experimentally, and introduce a general approach for weight initialization of non-negative neural networks. Our results indicate the importance of both excitatory and inhibitory elements in activation functions in incoherent ONNs, which should be considered for future design of optical activation functions for ONNs. Code is available at https://XXXXXXXX. |
Marlon Becker · Dominik Drees · Frank Brückerhoff-Plückelmann · Carsten Schuck · Wolfram Pernice · Benjamin Risse 🔗 |
-
|
Tree-Based Algorithms for Weakly Supervised Anomaly Detection
(
Poster
)
>
Particle physics searches that rely on a specific signal model have so far failed to find evidence for physics beyond the Standard Model. Model-agnostic methods provide an important alternative approach, as they can analyze large amounts of data for a wide range of potential anomalies. Many state-of-the-art anomaly detection algorithms are based on a weakly supervised classification task, where the data samples are distinguished from samples of a background template. A key challenge for such algorithms is their performance degradation in the presence of uninformative features, which introduces model dependence by requiring feature selection. In this work, we propose the use of tree-based algorithms in weakly supervised anomaly detection with tabular data, as they are not only significantly faster to train and evaluate than deep learning-based methods, but are also robust to uninformative features and achieve better performance. |
Thorben Finke · Marie Hein · Gregor Kasieczka · Michael Krämer · Alexander Mück · Parada Prangchaikul · Tobias Quadfasel · David Shih · Manuel Sommerhalder 🔗 |
-
|
HIDM: Emulating Large Scale HI Maps using Score-based Diffusion Models
(
Poster
)
>
Efficiently analyzing maps from upcoming large-scale surveys requires gaining direct access to a high-dimensional likelihood and generating large-scale fields with high fidelity, both of which represent major challenges. Using CAMELS simulations, we employ state-of-the-art score-based diffusion models to achieve both tasks simultaneously. We show that our model, HIDM, is able to efficiently generate high-fidelity, large-scale HI maps that are in good agreement with the CAMELS power spectrum, probability distribution, and likelihood up to second moments. HIDM represents a step towards maximizing the scientific return of future large-scale surveys. |
Sultan Hassan · Sambatra Andrianomena 🔗 |
-
|
Probabilistic-Machine-Learning-based Turbulence Model Learning with a Differentiable Solver
(
Poster
)
>
We present a novel, data-driven closure model for the Reynolds-averaged Navier-Stokes (RANS) equations which consists of two parts: a parametric one, which is a tensor basis neural network, and a non-parametric one, which makes use of latent random variables in order to capture aleatoric model uncertainty. Our fully Bayesian formulation, incorporating sparsity-inducing priors, identifies areas of the problem domain where the parametric closure falls short, requiring stochastic corrections to the Reynolds stress tensor. Training employs sparse, indirect data such as mean velocities and pressures, in contrast to the majority of alternatives, which require direct Reynolds stress data. For inference and learning, we employ Stochastic Variational Inference, facilitated by an adjoint-based differentiable solver. This end-to-end differentiable framework can ultimately yield accurate, probabilistic predictions for flow quantities, even in regions with model errors, as exemplified by the backward-facing step benchmark problem. |
Atul Agrawal · Phaedon S Koutsourelakis 🔗 |
-
|
AI ensemble for signal detection of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers
(
Poster
)
>
We introduce spatiotemporal-graph models that concurrently process data from the twin advanced LIGO detectors and the advanced Virgo detector. We trained these AI classifiers with 2.4 million IMRPhenomXPHM waveforms that describe quasi-circular, spinning, non-precessing binary black hole mergers with component masses $m_{1,2}\in[3M_\odot, 50M_\odot]$ and individual spins $s^z_{1,2}\in[-0.9, 0.9]$, and which include the $(\ell, |m|) = \{(2, 2), (2, 1), (3, 3), (3, 2), (4, 4)\}$ modes and mode mixing effects in the $(\ell = 3, |m| = 2)$ harmonics. We trained these AI classifiers within 22 hours using distributed training over 96 NVIDIA V100 GPUs in the Summit supercomputer. We then used transfer learning to create AI predictors that estimate the total mass of potential binary black holes identified by all AI classifiers in the ensemble. We used this ensemble, 3 AI classifiers and 2 predictors, to process a year-long test set in which we injected 300,000 signals. This year-long test set was processed within 5.19 minutes using 1024 NVIDIA A100 GPUs in the Polaris supercomputer (for AI inference) and 128 CPU nodes in the ThetaKNL supercomputer (for post-processing of noise triggers), housed at the Argonne Leadership Supercomputing Facility. These studies indicate that our AI ensemble provides state-of-the-art signal detection accuracy, and reports 2 misclassifications for every year of searched data. This is the first AI ensemble designed to search for and find higher order gravitational wave mode signals. |
Minyang Tian · Eliu Huerta 🔗 |
-
|
Latent space representations of cosmological fields
(
Poster
)
>
We investigate the possibility of learning representations of a cosmological multifield dataset from the CAMELS project. We train a very deep variational encoder on images which comprise three channels, namely gas density (Mgas), neutral hydrogen density (HI), and magnetic field amplitudes (B). The clustering of the images in feature space with respect to some cosmological/astrophysical parameters (e.g. $\Omega_{\rm m}$) suggests that the generative model has learned latent space representations of the high-dimensional inputs. We assess the quality of the latent codes by conducting a linear test on the extracted features, and find that a single dense layer is capable of recovering some of the parameters to a promising level of accuracy, especially the matter density, whose prediction corresponds to a coefficient of determination $R^{2} = 0.93$. Furthermore, results show that the generative model is able to produce images that exhibit statistical properties which are consistent with those of the training data, down to scales of $k\sim 4h/{\rm Mpc}$.
|
Sambatra Andrianomena · Sultan Hassan 🔗 |
-
|
Enhancing Data-Assimilation in CFD using Graph Neural Networks
(
Poster
)
>
We introduce a novel machine learning-based approach for data assimilation applied in the context of fluid mechanics. We consider as baseline the Reynolds-Averaged Navier-Stokes (RANS) equations, a set of equations where the unknown is represented by the mean flow and a closure model based on the Reynolds-stress tensor is required for correctly computing the solution. We consider an adjoint-based optimization method augmented by the introduction of Graph Neural Networks (GNNs). To this end, we first train a model for the closure term based on a GNN. Second, the GNN model is introduced in the end-to-end training process of data assimilation, where the RANS equations are part of the architecture and act as a constraint for a physically consistent prediction. We show our results using direct numerical simulations based on a Finite Element Method (FEM) solver; a two-fold interface between the GNN model and the solver allows the GNN's predictions to be incorporated into post-processing steps of the FEM analysis. |
Michele Quattromini · Michele Alessandro Bucci · Stefania Cherubini · Onofrio Semeraro 🔗 |
-
|
Gamma Ray AGNs: Estimating Redshifts and Blazar Classification using traditional Neural Networks with smart initialization and self-supervised learning
(
Poster
)
>
Redshift estimation and the classification of gamma-ray AGNs represent crucial challenges in the field of gamma-ray astronomy. Recent efforts have been made to tackle these problems using traditional machine learning methods. However, the simplicity of existing algorithms, combined with their basic implementations, underscores an opportunity and a need for further advancement in this area. Our approach begins by implementing a Bayesian model for redshift estimation, which can account for uncertainty while providing predictions with the desired confidence level. Subsequently, we address the classification problem by leveraging intelligent initialization techniques and employing soft voting. Additionally, we explore several potential self-supervised algorithms in their conventional form. Lastly, in addition to generating predictions for data with missing outputs, we ensure that the theoretical assertions put forth by both algorithms mutually reinforce each other. |
Sarvesh Gharat · Gopal Bhatta · ABHIMANYU BORTHAKUR 🔗 |
-
|
Enhancing the local expressivity of geometric graph neural networks
(
Poster
)
>
A central operation in geometric graph neural networks (GNNs) is the equivariant pairwise embedding, which encodes the local environment of each node as a learned representation. In this work, we examine the role of the pairwise embedding and consider a series of generalizations of its functional form beyond previous work. The new embeddings that we design considerably advance the state of the art in challenging distributions: as a highlight, when applied as an interatomic potential, we achieve a 29% relative reduction of force errors on diverse allotropes of lithium-intercalated carbon with a 4-fold reduction in parameter count. Furthermore, we demonstrate improved transferability in molecular datasets by varying the locality of the network according to the depth of the representation. |
Sam Walton Norwood · Lars L Schaaf · Ilyes Batatia · Arghya Bhowmik · Gabor Csanyi 🔗 |
-
|
Domain Adaptation for Measurements of Strong Gravitational Lenses
(
Poster
)
>
Upcoming surveys are predicted to discover galaxy-scale strong lenses on the order of 10^5, making deep learning methods necessary in lensing data analysis. Currently, there is insufficient real lensing data to train deep learning algorithms, but the alternative of training only on simulated data results in poor performance on real data. Domain Adaptation may be able to bridge the gap between simulated and real datasets. We utilize domain adaptation for the estimation of Einstein radius (Θ_E) in simulated galaxy-scale gravitational lensing images with different levels of observational realism. We evaluate two domain adaptation techniques - Domain Adversarial Neural Networks (DANN) and Maximum Mean Discrepancy (MMD). We train on a source domain of simulated lenses and apply it to a target domain of lenses simulated to emulate noise conditions in the Dark Energy Survey (DES). We show that both domain adaptation techniques can significantly improve the model performance on the more complex target domain dataset. This work is the first application of domain adaptation for a regression task in strong lensing imaging analysis. Our results show the potential of using domain adaptation to perform analysis of future survey data with a deep neural network trained on simulated data. |
Paxson Swierc · Yifan Zhao · Aleksandra Ciprijanovic · Brian Nord 🔗 |
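To make the Maximum Mean Discrepancy (MMD) term mentioned in the abstract above concrete, here is a minimal sketch of a biased squared-MMD estimate with a Gaussian (RBF) kernel. The kernel choice, the bandwidth, and the synthetic "source" and "target" features are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_ss = rbf_kernel(source, source, bandwidth).mean()
    k_tt = rbf_kernel(target, target, bandwidth).mean()
    k_st = rbf_kernel(source, target, bandwidth).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
source_feats = rng.normal(size=(100, 4))         # stand-in for source-domain features
shifted_feats = rng.normal(size=(100, 4)) + 3.0  # stand-in for a shifted target domain
mmd_same = mmd2(source_feats, source_feats)
mmd_shifted = mmd2(source_feats, shifted_feats)
print(mmd_same, mmd_shifted)
```

In a domain adaptation setup, a term of this kind is typically added to the task loss so that the network is pushed towards features whose distributions match across the two domains.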
-
|
QDC: Quantum Diffusion Convolution Kernels on Graphs
(
Poster
)
>
Graph convolutional neural networks (GCNs) operate by aggregating messages over local neighborhoods given the prediction task of interest. Many GCNs can be understood as a form of generalized diffusion of input features on the graph, and significant work has been dedicated to improving predictive accuracy by altering the ways of message passing. In this work, we propose a new convolution kernel that effectively rewires the graph according to the occupation correlations of the vertices by trading on the generalized diffusion paradigm for the propagation of a quantum particle over the graph. We term this new convolution kernel the Quantum Diffusion Convolution (QDC) operator. Through experiments on a range of datasets, we observe that QDC improves predictive performance on the widely used benchmark datasets when compared to similar methods. |
Thomas Markovich 🔗 |
-
|
Efficient and Robust Jet Tagging at the LHC with Knowledge Distillation
(
Poster
)
>
The challenging environment of real-time data processing systems at the Large Hadron Collider (LHC) strictly limits the computational complexity of algorithms that can be deployed. For deep learning models, this implies that only small models that have low capacity and weak inductive bias are feasible. To address this issue, we utilize knowledge distillation to leverage both the performance of large models and the reduced computational complexity of small ones. In this paper, we present an implementation of knowledge distillation, demonstrating an overall boost in the student models' performance for the task of classifying jets at the LHC. Furthermore, by using a teacher model with a strong inductive bias of Lorentz symmetry, we show that we can induce the same inductive bias in the student model, which leads to better robustness against arbitrary Lorentz boosts. |
Ryan Liu · Abhijith Gandrakota · Jennifer Ngadiuba · jean-roch vlimant · Maria Spiropulu 🔗 |
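As a rough illustration of knowledge distillation as described in the abstract above, the sketch below combines a temperature-softened teacher-student KL term with the usual cross-entropy on hard labels. This is the commonly used textbook form, assumed here for illustration; the temperature, the mixing weight `alpha`, and the toy logits are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5, eps=1e-12):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student_soft + eps)),
        axis=1,
    ).mean()
    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(len(labels)), labels] + eps).mean()
    return alpha * temperature**2 * kl + (1.0 - alpha) * ce

teacher = np.array([[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]])  # toy teacher logits
labels = np.array([0, 1])                               # toy hard labels
# With alpha=1 only the soft (teacher-matching) term contributes.
soft_same = distillation_loss(teacher.copy(), teacher, labels, alpha=1.0)
soft_off = distillation_loss(np.zeros_like(teacher), teacher, labels, alpha=1.0)
print(soft_same, soft_off)
```

The soft term vanishes when the student reproduces the teacher's output distribution, so minimizing it transfers the teacher's behavior (including any symmetry it has learned) to the smaller model.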
-
|
Fast Particle-based Anomaly Detection Algorithm with Variational Autoencoder
(
Poster
)
>
Model-agnostic anomaly detection is one of the promising approaches in the search for new beyond the standard model physics. In this paper, we present Set-VAE, a particle-based variational autoencoder (VAE) anomaly detection algorithm. We demonstrate a 2x signal efficiency gain compared with traditional subjettiness-based jet selection. Furthermore, with an eye to the future deployment to trigger systems, we propose the CLIP-VAE, which reduces the inference-time cost of anomaly detection by using the KL-divergence loss as the anomaly score, resulting in a 2x acceleration in latency and reducing the caching requirement. |
Ryan Liu · Abhijith Gandrakota · Jennifer Ngadiuba · jean-roch vlimant · Maria Spiropulu 🔗 |
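Using the KL-divergence term of a VAE as an anomaly score, as the abstract above describes, means only the encoder has to be evaluated at inference time. A minimal sketch follows, assuming a diagonal Gaussian encoder and a standard normal prior (the usual VAE setup, not confirmed by the abstract); the encoder outputs below are made up for illustration.

```python
import numpy as np

def kl_anomaly_score(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) per event,
    summed over latent dimensions. Larger values mean the encoded
    event lies further from the prior."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=-1)

# Hypothetical encoder outputs for two events with a 4-dimensional latent space.
mu = np.array([[0.0, 0.0, 0.0, 0.0],    # event encoded at the prior mean
               [3.0, 3.0, 3.0, 3.0]])   # event encoded far from the prior
log_var = np.zeros((2, 4))              # unit variance in every dimension
scores = kl_anomaly_score(mu, log_var)
print(scores)
```

Because the score depends only on `mu` and `log_var`, no decoder pass or reconstruction comparison is needed, which is consistent with the latency reduction the abstract reports.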
-
|
Preparing Spectral Data for Machine Learning: A Study of Geological Classification from Aerial Surveys
(
Poster
)
>
This study focuses on improving the preparation of spectral data for machine learning. It does so by conducting a case study that involves matching an airborne gamma-ray spectral survey of the San Francisco Bay area to geological classifications provided by the United States Geological Survey (Graymer et al., 2006). Our investigation has revealed three key approaches for enhancing accuracy in this task: 1) eliminating extraneous data segments unrelated to the main task, 2) augmenting minority classes to improve class balances, and 3) merging inconsistent classes. By incorporating these methods, we were able to achieve a significant increase in classification accuracy. Specifically, we increased the accuracy from an initial 40.8% to approximately 72.7%. We plan to continue our work to further enhance performance, with the goal of extending the applicability of these methods to other data types and tasks. One potential future application is the detection of rare earth elements from aerial surveys. |
Jun Woo Chung · Alex Sim · Brian Quiter · Yuxin Wu · Weijie Zhao · Kesheng Wu 🔗 |
-
|
Loss-driven sampling within hard-to-learn areas for simulation-based neural network training
(
Poster
)
>
This paper focuses on active learning methods for training neural networks from synthetic input samples that can be generated on demand. This includes Physics Informed Neural Networks (PINNs), simulation-based inference, deep surrogates and deep reinforcement learning. An adaptive process observes the training progress and steers the data generation with the goal of speeding up and increasing the quality of training. We propose a novel adaptive sampling method that concentrates samples close to the areas showing high loss values. Compared to the state-of-the-art R3 sampling, our algorithm converges to a validation loss of 0.5 in 6000 iterations, while it takes 25000 iterations to reach a loss of 0.7 for the R3 algorithm when training a PINN with the Allen-Cahn equation. |
Sofya Dymchenko · Bruno Raffin 🔗 |
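The general idea of concentrating new samples near high-loss regions can be sketched as follows. This is not the authors' algorithm: it is a simple stand-in that picks existing points with probability proportional to their loss and perturbs them with Gaussian noise. The function name, the noise scale, and the synthetic loss landscape are all assumptions.

```python
import numpy as np

def resample_near_high_loss(points, losses, n_new, noise_scale, low, high, rng):
    """Draw new training inputs close to existing points with large loss."""
    probs = losses / losses.sum()                  # loss-proportional weights
    idx = rng.choice(len(points), size=n_new, p=probs)
    jitter = rng.normal(scale=noise_scale, size=(n_new, points.shape[1]))
    return np.clip(points[idx] + jitter, low, high)  # stay inside the domain

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(500, 2))     # current collocation points
center = np.array([0.5, 0.5])                      # hypothetical hard-to-learn spot
losses = np.exp(-np.sum((points - center) ** 2, axis=1) / 0.02)  # synthetic loss
new_points = resample_near_high_loss(points, losses, 1000, 0.05, -1.0, 1.0, rng)

dist_old = np.linalg.norm(points - center, axis=1).mean()
dist_new = np.linalg.norm(new_points - center, axis=1).mean()
print(dist_old, dist_new)
```

In a simulation-based training loop, the new points would be passed to the data generator (for a PINN, used as fresh collocation points) before the next training iterations.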
-
|
Long Time Series Data Release from Broadband Axion Dark Matter Experiment
(
Poster
)
>
Axions are a promising dark matter candidate, yet their feeble interactions with visible matter pose a considerable challenge in detecting them. The ABRACADABRA-10cm experiment was meticulously built to generate long time series data within which axion signals could hide. Currently, the axion analysis of this dataset is only conducted in the frequency domain, omitting the valuable phase information embedded in the raw time series. In this public data release, we present a labeled dataset comprised of time series data collected from the ABRACADABRA detector, complete with axion-mimicking hardware signal injections. This dataset paper sets the stage for critical challenges faced in ABRACADABRA data analysis, including peak finding, denoising, and time series reconstructions. The success of machine learning algorithms in tackling these challenges will boost the experimental sensitivity to the enigmatic axion. |
Jessica Fry · Aobo Li 🔗 |
-
|
Physics-Informed Machine Learning for Reduced Space Chemical Kinetics
(
Poster
)
>
Modeling detailed chemical kinetics stands as a primary challenge in combustion simulations. Recent machine learning (ML) approaches aim to accelerate chemical kinetics integration, though their application is often limited to simpler reaction mechanisms. This study presents a novel framework to enforce physical constraints, specifically total mass and elemental conservation, into the training of ML models for reduced-space chemical kinetics of large and complex reaction mechanisms. Given the strong correlation between full and reduced solution vectors, our method utilizes a small neural network to establish an accurate and physically consistent mapping. By leveraging this mapping, we enforce physical constraints in the training process of the ML model for reduced space chemical kinetics. The framework is demonstrated here with the chemical kinetics of CH4 oxidation. The resulting solution vectors from our Deep Operator Networks-based approach are not only accurate but also align more consistently with physical laws. |
Anuj Kumar · Tarek Echekki 🔗 |
-
|
Smartpixels: Towards on-sensor inference of charged particle track parameters and uncertainties
(
Poster
)
>
The combinatorics of track seeding has long been a computational bottleneck for triggering and offline computing in High Energy Physics (HEP), and remains so for the HL-LHC. Next-generation pixel sensors will be sufficiently fine-grained to determine angular information of the charged particle passing through from pixel-cluster properties. This detector technology immediately improves the situation for offline tracking, but any major improvements in physics reach are unrealized since they are dominated by level-one trigger acceptance. We will demonstrate track angle and hit position prediction, including errors, using a mixture density network within a single layer of silicon as well as the progress towards and status of implementing the neural network in hardware on both FPGAs and ASICs. |
Lindsey Gray · Jennet Dickinson · Rachel Kovach-Fuentes · Morris Swartz · Giuseppe Di Guglielmo · Alice Bean · Douglas Berry · Manuel Blanco Valentin · Karri DiPetrillo · Farah Fahim · Jim Hirschauer · Shruti Kulkarni · Ron Lipton · Petar Maksimovic · Corrinne Mills · Mark Neubauer · Benjamin Parpillon · Gauri Pradhan · Chinar Syal · Nhan Tran · Jieun Yoo · Aaron Young
|
-
|
On Representations of Mean-Field Variational Inference
(
Poster
)
>
We obtain representations of Mean-Field Variational Inference (MFVI) in the forms of gradient flows on product spaces, quasilinear partial differential equations and McKean-Vlasov diffusion processes. These new interpretations not only provide new understanding of MFVI, but also allow us to conduct new analysis on its convergence. |
Soumyadip Ghosh · Yingdong Lu · Tomasz Nowicki · Edith Zhang 🔗 |
-
|
Active learning meets fractal decision boundaries: a cautionary tale from the Sitnikov three body problem
(
Poster
)
>
Predicting the evolution of chaotic systems is notoriously hard. These systems are ubiquitous in astronomy, the most well known being the gravitational N-body problem. There have been increasing efforts to develop machine learning (ML) methods to predict the evolution of such systems, with the goal of speeding up simulations. In these settings, Active Learning (AL) is often used to improve the performance of models. Here we use the Sitnikov three-body problem, the simplest case of the N-body problem that is capable of chaotic behavior, to illustrate that AL may fail to improve training performance in these conditions, likely due to the fractal nature of the decision boundary. This is an important result for astronomers planning to optimize large sets of N-body simulations via AL in concrete applications, such as the simulation of star clusters with the goal of constraining gravitational wave emission, and for applications in other fields involving chaotic systems. |
Nicolas Payot · Mario Pasquato · Alessandro Alberto Trani · Yashar Hezaveh · Laurence Perreault-Levasseur 🔗 |
-
|
A deep learning framework for jointly extracting spectra and source-count distributions of count maps
(
Poster
)
>
Gamma-ray telescopes measure the direction and energy of incoming photons, resulting in photon-count maps that contain both spatial and spectral information. A major goal when analyzing such data is to determine source-count distributions (SCDs), which characterize the brightness of point-sources too faint to be detected individually. Statistical and machine learning methods for this task exist; however, they typically neglect the photon energy. We present a deep learning framework able to jointly reconstruct the spectra of different emission components and the SCDs of point-source populations. In a proof-of-concept example, we show that our method accurately extracts even complex-shaped spectra and SCDs from simulated maps. |
Florian Wolf · Florian List · Nicholas Rodd · Oliver Hahn 🔗 |
-
|
Machine learning-based compression of quantum many body physics: PCA and autoencoder representation of the vertex function
(
Poster
)
>
The vertex function, a continuous function of three momenta describing particle-particle scattering that is typically obtained by sophisticated calculations, plays a central role in the Feynman diagram approach to quantum many-body physics. Here, we use Principal Component Analysis (PCA) and a deep convolutional autoencoder to derive compact, low-dimensional representations of the vertex functions derived using the functional renormalization group for the two dimensional Hubbard model, a paradigmatic theoretical model of interacting electrons on a lattice. Both methodologies successfully reduced the dimensionality to a mere few dimensions while preserving accuracy. PCA demonstrated superior performance in dimensionality reduction compared to the autoencoder. The results suggest the presence of a fundamental underlying structure in the vertex function and suggest paths to dramatically reducing the computational complexity of quantum many-body calculations. |
Jiawei Zang · Matija Medvidović · Dominik Kiese · Domenico Di Sante · Anirvan Sengupta · Andy Millis 🔗 |
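The PCA side of the compression described in the abstract above can be sketched with a plain singular value decomposition. The data here are synthetic (a few hidden factors mixed into many features plus small noise) and only mimic the situation in which a high-dimensional object has a low-dimensional structure; nothing below uses actual vertex-function data.

```python
import numpy as np

def pca_compress(data, n_components):
    """Project centered data onto its leading principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    codes = centered @ components.T
    return codes, components, mean

def pca_reconstruct(codes, components, mean):
    """Map low-dimensional codes back to the original feature space."""
    return codes @ components + mean

rng = np.random.default_rng(0)
hidden = rng.normal(size=(100, 3))      # three underlying factors (assumed)
mixing = rng.normal(size=(3, 50))       # embedding into 50 features
data = hidden @ mixing + 1e-3 * rng.normal(size=(100, 50))

codes, components, mean = pca_compress(data, n_components=3)
recon = pca_reconstruct(codes, components, mean)
rel_err = np.linalg.norm(recon - data) / np.linalg.norm(data)
print(codes.shape, rel_err)
```

A small reconstruction error with only a few components is the kind of evidence the abstract points to when it suggests an underlying low-dimensional structure.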
-
|
Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets
(
Poster
)
>
Deep learning models have been shown to outperform methods that rely on summary statistics, like the power spectrum, in extracting information from complex cosmological data sets. However, due to differences in the subgrid physics implementation and numerical approximations across different simulation suites, models trained on data from one cosmological simulation show a drop in performance when tested on another. Similarly, models trained on any of the simulations would also likely experience a drop in performance when applied to observational data. Training on data from two different suites of the CAMELS hydrodynamic cosmological simulations, we examine the generalization capabilities of Domain Adaptive Graph Neural Networks (DA-GNNs). By utilizing GNNs, we capitalize on their capacity to capture structured scale-free cosmological information from galaxy distributions. Moreover, by including unsupervised domain adaptation via Maximum Mean Discrepancy (MMD), we enable our models to extract domain-invariant features. We demonstrate that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks (up to 28% better relative error and up to almost an order of magnitude better χ²). Using data visualizations, we show the effects of domain adaptation on proper latent space data alignment. This shows that DA-GNNs are a promising method for extracting domain-independent cosmological information, a vital step toward robust deep learning for real cosmic survey data. |
Andrea Roncoli · Aleksandra Ciprijanovic · M Voetberg · Francisco Villaescusa · Brian Nord 🔗 |
-
|
Multibasis Encodings in Recurrent Neural Network Wave Functions for Variational Optimization
(
Poster
)
>
Solving optimization problems via neural networks has proven to be a promising approach in yielding better solutions. However, the full potential of parameterized models has yet to be fully explored. Motivated by the success of variational quantum optimization with multibasis encodings, we propose a quantum-inspired machine learning algorithm that integrates both approaches to reduce the system size as well as parameters in a neural network ansatz. We demonstrate the performance of the proposed algorithm by solving Ising chain systems, resulting in faster convergence towards the ground state energy. This study holds the potential for widespread applications across various fields that require efficient optimization for large-scale problems. |
Wirawat Kokaew 🔗 |
-
|
Simulation Based Inference of BNS Kilonova Properties: A Case Study with AT2017gfo
(
Poster
)
>
Kilonovae are a class of astronomical transients observed as counterparts to mergers of compact binary systems, such as a binary neutron star (BNS) or black hole-neutron star (BHNS) inspirals. They serve as probes for heavy-element nucleosynthesis in astrophysical environments, while together with gravitational wave emission constraining the distance to the merger itself, they can place constraints on the Hubble constant. Obtaining the physical parameters (e.g. ejecta mass, velocity, composition) of a kilonova from observations is a complex inverse problem, usually tackled by sampling-based inference methods such as Markov-chain Monte Carlo (MCMC) or nested sampling techniques. These methods often rely on computing approximate likelihoods, since a full simulation of compact object mergers involves expensive computations such as integrals; the calculation of the likelihood of the observed data given the parameters can become intractable, rendering likelihood-based inference approaches inapplicable. We propose here to use Simulation-based Inference (SBI) techniques to infer the physical parameters of BNS kilonovae from their spectra, using simulations produced with KilonovaNet. Our model uses Sequential Neural Posterior Estimation (SNPE) together with an embedding neural network to accurately predict posterior distributions from simulated spectra. We further test our model with real observations from AT2017gfo, the only kilonova with multi-messenger data, and show that our estimates agree with previous likelihood-based approaches. |
Phelipe Darc · Clecio Bom · Bernardo Fraga · Charles D. Kilpatrick 🔗 |
-
|
Physics-aware Modeling of an Accelerated Particle Cloud
(
Poster
)
>
Particle accelerator simulators, pivotal for acceleration optimization, are constrained by computational time. While current machine learning surrogate models for particle accelerator simulators offer efficiency, they lack in-depth positional data for individual particles. Drawing from 3D computer vision, we propose adapting the point cloud deep learning methods, adept at processing 3D point sets, to model particle beams. |
Emmanuel Goutierre · Hayg Guler · Christelle Bruni · Johanne Cohen · Michele Sebag 🔗 |
-
|
Optimized Dry Cooling for Solar Power Plants
(
Poster
)
>
Concentrated solar power (CSP) plants offer sustainable energy with the benefit of day-to-night energy storage. The recent development of the supercritical carbon dioxide (sCO2) Brayton cycle made CSP plants cost-competitive. However, the cost of cooling required for these CSP plants can vary wildly depending on design, and current cooler designs are far from optimal. Here, we optimize the design and configuration of a dry cooling system. We develop a physics-based simulation of the cooling properties of an air-cooled heat exchanger. Using this simulator, we leverage recent results in high-dimensional Bayesian optimization to find dry cooler designs that minimize lifetime cost, reducing this cost by about 67% compared to recently proposed designs. Our simulation and optimization framework can increase the development pace of economically viable sustainable energy generation systems. |
Hansley Narasiah · Ouail Kitouni · Andrea Scorsoglio · Bernd Sturdza · Shawn Hatcher · Dolores Garcia · Matt Kusner 🔗 |
-
|
A Physics-Constrained NeuralODE Approach for Robust Learning of Stiff Chemical Kinetics
(
Poster
)
>
The high computational cost associated with solving for detailed chemistry poses a significant challenge for predictive computational fluid dynamics (CFD) simulations of turbulent reacting flows. These models often require solving a system of coupled stiff ordinary differential equations (ODEs). While deep learning techniques have been experimented with to develop faster surrogate models, they often fail to integrate reliably with CFD solvers. This instability arises because deep learning methods optimize for training error without ensuring compatibility with ODE solvers, leading to accumulation of errors over time. Recently, NeuralODE-based techniques have offered a promising solution by effectively modeling chemical kinetics. In this study, we extend the NeuralODE framework for stiff chemical kinetics by incorporating mass conservation constraints directly into the loss function during training. This ensures that the total mass and the elemental species mass are conserved, a critical requirement for reliable downstream integration with CFD solvers. Our results demonstrate that this enhancement not only improves the physical consistency with respect to mass conservation criteria but also ensures better robustness and makes the training process more computationally efficient. |
Tadbhagya Kumar · Anuj Kumar · Pinaki Pal 🔗 |
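A loss with mass conservation penalties, in the spirit of the abstract above, might look like the following sketch. The exact form of the authors' constraint terms is not given, so the quadratic penalties, the weight, and the four-species toy mixture (CH4, O2, CO2, H2O) are assumptions made for illustration.

```python
import numpy as np

# Toy setup: species CH4, O2, CO2, H2O and elements C, H, O (illustrative only).
atomic_mass = np.array([12.011, 1.008, 15.999])   # C, H, O in g/mol
atom_counts = np.array([[1, 0, 1, 0],             # C atoms per species
                        [4, 0, 0, 2],             # H atoms per species
                        [0, 2, 2, 1]])            # O atoms per species
molar_mass = atomic_mass @ atom_counts            # g/mol per species
# element_matrix[e, s]: mass fraction of element e within species s.
element_matrix = atom_counts * atomic_mass[:, None] / molar_mass[None, :]

def mass_constrained_loss(y_pred, y_true, element_matrix, weight=1.0):
    """Mean squared error on species mass fractions plus penalties on
    violations of total mass and elemental mass conservation."""
    data_loss = np.mean((y_pred - y_true) ** 2)
    total_mass_err = y_pred.sum(axis=1) - y_true.sum(axis=1)
    element_err = (y_pred - y_true) @ element_matrix.T
    penalty = np.mean(total_mass_err**2) + np.mean(element_err**2)
    return data_loss + weight * penalty

y_true = np.array([[0.2, 0.8, 0.0, 0.0]])            # mass fractions summing to 1
y_bad = y_true + np.array([[0.05, 0.0, 0.0, 0.0]])   # prediction that creates mass
loss_exact = mass_constrained_loss(y_true, y_true, element_matrix)
loss_bad = mass_constrained_loss(y_bad, y_true, element_matrix)
mse_bad = np.mean((y_bad - y_true) ** 2)
print(loss_exact, loss_bad, mse_bad)
```

The penalty makes a mass-creating prediction cost more than its plain mean squared error would suggest, which is how such a term steers training towards conservative solutions.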
-
|
Trick or treat? Evaluating stability strategies in graph network-based simulators
(
Poster
)
>
Particle-based simulators are ubiquitous in science and engineering, but some are expensive both in terms of time and compute. In simulations where local interactions play a major role, graph network-based simulators (GNS) show promise to address these issues due to their ability to model local interactions and relationships through the graph structure. However, their autoregressive nature makes them susceptible to distribution shifts. Numerous strategies, or tricks, have been proposed to address this issue. In this work, we evaluate three of them: adding a random walk to the input, taking the loss of a sequence, and the pushforward trick. We find that these tricks fail to address the underlying problem, even when the dynamics are relatively simple. |
Omer Rochman Sharabi · Gilles Louppe 🔗 |
-
|
Super-Resolution Emulation of Large Cosmological Fields with a 3D Conditional Diffusion Model
(
Poster
)
>
High-resolution (HR) simulations of baryonic matter in cosmology often take millions of CPU hours. On the other hand, low resolution (LR) dark matter simulations of the same cosmological volume use minimal computing resources. In this paper we train a conditional diffusion model to upgrade LR dark matter simulations probabilistically to HR baryonic matter simulations. Our approach is based on the Palette diffusion model, which we generalize to 3 dimensions. Our superresolution emulator is trained to perform outpainting, and can upgrade arbitrarily large cosmological volumes from LR to HR, using an iterative outpainting procedure. |
Adam Rouhiainen · Michael Gira · Gary Shiu · Kangwook Lee · Moritz Münchmeyer 🔗 |
-
|
Equivariant Networks for Robust Galaxy Morphology Classification
(
Poster
)
>
We propose the use of group convolutional neural network architectures (GCNNs) equivariant to the 2D Euclidean group, $E(2)$, for the task of galaxy morphology classification by utilizing symmetries of the data present in galaxy images as an inductive bias in the architecture. We conduct robustness studies by introducing artificial perturbations via Poisson noise insertion and one-pixel adversarial attacks to simulate the effects of limited observational capabilities. We train, validate, and test GCNNs on the Galaxy10 DECals dataset and find that GCNNs achieve higher classification accuracy and are consistently more robust than their non-equivariant counterparts, with an architecture equivariant to the group $D_{16}$ achieving a $95.52 \pm 0.18\%$ test-set accuracy and losing $<6\%$ accuracy on a 50%-noise dataset.
|
Sneh Pandya · Purvik Patel · Franc O · Jonathan Blazek 🔗 |
-
|
Reduced-order modeling for parameterized PDEs via implicit neural representations
(
Poster
)
>
We present a new data-driven reduced-order modeling approach to efficiently solve parametrized partial differential equations (PDEs) for many-query problems. This work is inspired by the concept of implicit neural representation (INR), which models physics signals in a continuous manner and independent of spatial/temporal discretization. The proposed framework encodes the PDE and utilizes a parametrized neural ODE (PNODE) to learn latent dynamics characterized by multiple PDE parameters. PNODE can be inferred by a hypernetwork to reduce the potential difficulties in learning PNODE due to a complex multilayer perceptron (MLP). The framework uses an INR to decode the latent dynamics and reconstruct accurate PDE solutions. Further, a physics-informed loss is also introduced to correct the prediction of unseen parameter instances. A numerical experiment is performed on a two-dimensional Burgers equation with a large variation of PDE parameters. We evaluate the proposed method at a large Reynolds number and obtain a speedup of up to ~50x and a relative l2 error of ~3%. |
Tianshu Wen · Kookjin Lee · Youngsoo Choi 🔗 |
-
|
Simulation-Based Inference for Detecting Blending in Spectra
(
Poster
)
>
Many galaxies overlap visually from the vantage point of Earth; these galaxies are known as "blends". Undetected blends can lead to errors in the estimation of quantities of scientific interest, such as cosmological parameters and redshift. We propose a generative model based on a state-of-the-art simulator of galaxy spectra, and develop a likelihood-free inference method to detect unrecognized blends. Our inference routine simulates both blended and unblended spectra with which it trains an inference network to solve the inverse problem, that is, to map spectra to a Bernoulli distribution indicating the presence or absence of blendedness. Our experiments demonstrate the potential of our method to detect unrecognized blends in high-resolution spectral data from the Dark Energy Spectroscopic Instrument (DESI). |
Declan McNamara · Jeffrey Regier 🔗 |
-
|
JETLOV: Enhancing Jet Tree Tagging through Neural Network Learning of Optimal LundNet Variables
(
Poster
)
>
Machine learning has played a pivotal role in advancing physics, with deep learning notably contributing to solving complex classification problems such as jet tagging in the field of jet physics. In this experiment, we aim to harness the full potential of neural networks while acknowledging that, at times, we may lose sight of the underlying physics governing these models. Nevertheless, we demonstrate that we can achieve remarkable results obscuring physics knowledge and relying completely on the model's outcome. We introduce JetLOV, a composite comprising two models: a straightforward multilayer perceptron (MLP) and the well-established LundNet. Our study reveals that we can attain comparable jet tagging performance without relying on the pre-computed LundNet variables. Instead, we allow the network to autonomously learn an entirely new set of variables, devoid of a priori knowledge of the underlying physics. These findings hold promise, particularly in addressing the issue of model dependence, which can be mitigated through generalization and training on diverse data sets. |
Giorgio Cerro 🔗 |
-
|
Hierarchical Cross-entropy Loss for Classification of Astrophysical Transients
(
Poster
)
>
Astrophysical transient phenomena are traditionally classified spectroscopically in a hierarchical taxonomy; however, this graph structure is currently not utilized in neural net-based photometric classifiers for time-domain astrophysics. Instead, independent classifiers are trained for different tiers of classified data, and events are excluded if they fall outside of these well-defined but flat classification schemes. Here, we introduce a weighted hierarchical cross-entropy objective function for classification of astrophysical transients. Our method allows users to directly build and use physics- or observationally-motivated tree-based taxonomies. Our weighted hierarchical cross-entropy loss directly uses this graph to accurately classify all targets into any node of the tree, re-weighting imbalanced classes. We test our novel loss on a set of variable stars and extragalactic transients from the Zwicky Transient Facility, showing that we can achieve similar performance to fine-tuned classifiers with the advantage of notably more flexibility in downstream classification tasks. |
V Villar 🔗 |
-
|
Surrogate Model Training Data for FIDVR-related Voltage Control in Large-scale Power Grids
(
Poster
)
>
This work presents an effective machine learning (ML) data set related to the short-term voltage dynamics in power systems. Power systems dynamics are highly nonlinear and intricate. Model designs/specifications in power systems need expertise to capture dynamic phenomena. ML has become an important tool for analyzing complex behaviors of physical systems, but ML models need quality data sets for training and testing. Learning surrogate models to replicate certain dynamic behaviors of power systems is a growing area of interest; however, building the required data sets can be challenging. We utilize the high performance computing (HPC)-based grid simulator GridPACK to create voltage dynamics of a bulk power system, namely the IEEE 300 bus test system, capturing the fault-induced delayed voltage recovery (FIDVR) phenomenon. FIDVR is generally mitigated by an under voltage load shedding (UVLS)-based control strategy. The data set created here contains the trajectory data of voltage dynamics under different control actions generated by the standard UVLS strategy and random noise. We present the structure of the data set and its application in learning a dynamic surrogate model. Finally, other suitable ML-based applications of the given data set are discussed, thereby helping to strengthen reusable science practices. |
Tianzhixi Yin · Renke Huang · Ramij Raja Hossain · Qiuhua Huang · Jie Tan · Wenhao Yu 🔗 |
-
|
Differentiable, End-to-End Forward Modeling for 21 cm Cosmology: Robust Systematics Error Budgeting and More
(
Poster
)
>
A new generation of radio telescopes is being built to map the growth of cosmological structure throughout the majority of the observable universe, giving us access to new cosmological information that will shed light on outstanding questions in astrophysics and cosmology. These telescopes use 21 cm emission from neutral hydrogen as a tracer of structure, but at the low radio frequencies at which they operate, they face a daunting systematics suppression challenge. These systematics are wide ranging, and are generally considerably brighter than the underlying cosmological signal of interest, setting up a delicate signal separation problem that has yet to be overcome by the field. We present a new framework based on differentiable forward models that will enable the joint modeling of systematics in an end-to-end manner for the first time, allowing us to better subtract low-level systematics and compute more robust error bars. This framework is made possible by high-performance machine learning frameworks that use automatic differentiation to quickly compute exact posterior gradients that are then fed to gradient-aware optimization and posterior sampling routines. |
Nicholas Kern 🔗 |
-
|
Investigating the Ability of PINNs To Solve Burgers’ PDE Near Finite-Time BlowUp
(
Poster
)
>
Physics Informed Neural Networks (PINNs) have been achieving ever newer feats of solving complicated PDEs numerically while offering an attractive trade-off between accuracy and speed of inference. A particularly challenging aspect of PDEs is that there exist simple PDEs which can evolve into singular solutions in finite time starting from smooth initial conditions. In recent times some striking experiments have suggested that PINNs might be good at even detecting such finite-time blow-ups. In this work, we embark on a program to investigate this stability of PINNs from a rigorous theoretical viewpoint. Firstly, we derive generalization bounds for PINNs for Burgers' PDE, in arbitrary dimensions, under conditions that allow for a finite-time blow-up. Then we demonstrate via experiments that our bounds are significantly correlated to the L2-distance of the neurally found surrogate from the true blow-up solution, when computed on sequences of PDEs that are getting increasingly close to a blow-up. |
Dibyakanti Kumar · Anirbit Mukherjee 🔗 |
-
|
Detection and Segmentation of Ice Blocks in Europa's Chaos Terrain Using Mask R-CNN
(
Poster
)
>
link
The complex icy surface of Jupiter's moon, Europa, has long fascinated planetary science and astrobiology communities. NASA spacecraft observations of Europa have revealed an enigmatic 'chaos terrain,' characterized by jigsaw-like areas of broken ice blocks caused by significant past subsurface disruption events. Speculation suggests the ice crust in these regions may be thinner, potentially offering better access to a warm ocean that may harbor complex organic compounds. These regions are favorable targets for future solar system missions, and may offer additional insight into Europa's internal processes. Although substantial progress has been made in visually cataloging chaos terrain, the precise mapping of ice blocks is laborious, subjective, and resource-intensive. Leveraging the capabilities of machine learning (ML) algorithms to expedite and automate such tasks will be crucial to scale this effort to other solar system bodies. To address this, we explore using a Mask R-CNN and transfer learning to detect and segment individual ice blocks within chaos terrain. Our current model achieves a highest precision score of 71.8% and recall score of 67.6%. We present the current strengths and limitations of our model and dataset while outlining avenues for further improvement. This work aims to contribute to future mission planning for Europa and other solar system bodies. Additionally, it highlights the unique algorithmic challenges posed by planetary science data and emphasizes the need for innovative ML solutions. |
Marina Dunn · Conor Nixon · Alyssa Mills · Ahmed Awadallah · Ethan Duncan · John Santerre · Douglas Trent · Andrew Larsen 🔗 |
-
|
Neural Networks vs. Whittaker Smoothing: Advanced Techniques for Scattering Signal Removal in 3D Fluorescence spectra
(
Poster
)
>
Fluorescence excitation emission matrices (EEMs) have a trilinear structure, aligning perfectly with the tensor rank decomposition, PARAFAC. Consequently, PARAFAC has become essential for extracting information from freshwater EEMs, pinpointing individual fluorophore groups, and tracking their behavior across diverse environments. However, EEMs of seawaters, with typically low organic matter, are often dominated by Rayleigh and Raman scattering, which deviates from the trilinear model. Traditional one-dimensional interpolation to eliminate these interferences gives results that depend on the direction in which it is applied to the matrix, and it struggles with noisy data. Our proposed techniques, employing Whittaker smoothing and CNN, effectively eliminate scattering signals, even in noise-rich scenarios. Notably, CNN adeptly preserves the overall EEM shape across various sizes and dimensions, establishing itself as an optimal choice for interpolating scattering zones in EEMs of organic matter-deficient freshwaters. |
Aleksandr Zakuskin · Ivan Krylov · Timur Labutin 🔗 |
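To make the Whittaker smoothing mentioned in the abstract above concrete, here is a minimal one-dimensional sketch. It assumes the standard penalized least-squares form with a second-difference penalty and zero weights on the samples to be interpolated; the synthetic signal, the masked index range, and the value of `lam` are illustrative choices, and nothing here reproduces the authors' two-dimensional EEM pipeline.

```python
import numpy as np

def whittaker_smooth(y, w, lam=10.0):
    """Minimise sum_i w_i (y_i - z_i)^2 + lam * sum_j (second difference of z)_j^2."""
    n = len(y)
    # (n-2) x n second-difference operator: row i is e_i - 2 e_{i+1} + e_{i+2}.
    D = np.diff(np.eye(n), n=2, axis=0)
    A = np.diag(w) + lam * D.T @ D
    return np.linalg.solve(A, w * y)

x = np.linspace(0.0, 1.0, 50)
signal = 2.0 * x + 1.0            # toy "true" emission profile (assumed)
y = signal.copy()
y[20:30] += 5.0                   # stand-in for a scattering band
w = np.ones_like(y)
w[20:30] = 0.0                    # zero weight: ignore the band, let the penalty interpolate
z = whittaker_smooth(y, w)
```

With zero weights the corrupted samples do not pull on the fit, so the smoothness penalty alone fills the gap. For this straight-line toy signal the recovery is exact up to rounding, which would not hold for curved spectra.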
-
|
Benchmarking of Fast and Interpretable UF Machine Learning Potentials
(
Poster
)
>
Ab initio methods offer great promise for materials design, but they come with a hefty computational cost. Recent advances with Machine Learning potentials (MLPs) have revolutionized molecular dynamics simulations by providing high accuracies similar to ab initio models but at much reduced computational cost. Our study evaluates the Ultra-Fast Potential (UF3), employing linear regression with a cubic B-spline basis for assessing effective two- and three-body potentials. On benchmarking, UF3 displays comparable precision to established models like GAP, MTP, NNP (Behler-Parrinello), and qSNAP MLPs, yet is faster by two to three orders of magnitude. A distinct feature of UF3 is its capability to render visual representations of learned two- and three-body potentials, shedding light on potential gaps in the learning model. In refining UF3's performance, a comprehensive sweep of the hyperparameter space was undertaken, emphasizing finer granularity in zones indicative of optimal performance. This endeavor aims to provide insights into the smoothness of the UF3 hyperparameter space, and to offer users a foundational set of default hyperparameters as a starting point for optimization. While our current optimizations are concentrated on energies and forces, we are primed to broaden UF3’s evaluation spectrum, focusing on its applicability in critical areas of Molecular Dynamics simulations. The outcome of these investigations will not only enhance the predictability and usability of UF3 but also pave the way for its broader applications in advanced materials discovery and simulations. |
Pawan Prakash 🔗 |
-
|
A Physics-Informed Variational Autoencoder for Rapid Galaxy Inference and Anomaly Detection
(
Poster
)
>
The Vera C. Rubin Observatory is slated to observe nearly 20 billion galaxies during its decade-long Legacy Survey of Space and Time. The rich imaging data it collects will be an invaluable resource for probing galaxy evolution across cosmic time, characterizing the host galaxies of transient phenomena, and identifying novel populations of anomalous systems. To facilitate these studies, we introduce a convolutional variational autoencoder trained to rapidly estimate the redshift, stellar mass, and star-formation rate of galaxies from multi-band imaging data. We show that our CVAE can be used to identify physically-meaningful anomalies in large galaxy samples >100x faster than the leading parameter inference techniques. |
Alex Gagliano · Ashley Villar 🔗 |
-
|
Pythia: A prototype artificial agent for designing optimal gravitational-wave follow-up campaigns
(
Poster
)
>
Joint observations in electromagnetic and gravitational waves shed light on the physics of objects and surrounding environments with extreme gravity that are otherwise unreachable via siloed observations in each messenger. However, such detections remain challenging due to the rapid and faint nature of counterparts. Protocols for discovery and inference still rely on human experts manually inspecting survey alert streams and intuiting optimal usage of limited follow-up resources. Strategizing an optimal follow-up program requires adaptive sequential decision-making given evolving light curve data that maximizes a global objective despite incomplete information and is robust to stochasticity introduced by detectors/observing conditions. We design a novel reinforcement learning agent that executes such a design for the goal of maximizing follow-up photometry for the true kilonova among several contaminant transient light curves from the Zwicky Transient Facility. It achieves 3$\times$ higher accuracy compared to a random strategy and comes close to human-level performance. We suggest that more complex agents (e.g. using deep Q networks or policy gradient algorithms) could perform on par with or surpass human experts. Agents like these could pave the way for machine-directed software infrastructure to efficiently respond to next generation detectors, for conducting science inference and optimally planning expensive follow-up observations, scalably and with demonstrable performance guarantees.
|
Niharika Sravan · Matthew Graham · Michael Coughlin · Shreya Anand · Tomas Ahumada 🔗 |
-
|
Probabilistic Reconstruction of Dark Matter fields from galaxies using diffusion models
(
Poster
)
>
Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter components that cannot be directly observed. The relationship between dark matter density fields and galaxy distributions can be sensitive to assumptions in cosmology and astrophysical processes embedded in the galaxy formation models, which remain uncertain in many respects. Based on state-of-the-art galaxy formation simulation suites with varied cosmological parameters and sub-grid astrophysics, we develop a diffusion generative model to predict the unbiased posterior distribution of the underlying dark matter fields from the given stellar mass fields, while marginalizing over the uncertainties in cosmology and galaxy formation. |
Carolina Cuesta · Yueying Ni · Core Francisco Park · Nayantara Mudur · Victoria Ono 🔗 |
-
|
Predicting the Age of Astronomical Transients from Real-Time Multivariate Time Series
(
Poster
)
>
Astronomical transients, such as supernovae and other rare stellar explosions, have been instrumental in some of the most significant discoveries in astronomy. New astronomical sky surveys will soon record unprecedented numbers of transients as sparsely and irregularly sampled multivariate time series. To improve our understanding of the physical mechanisms of transients and their progenitor systems, early-time measurements are necessary. Prioritizing the follow-up of transients based on their age along with their class is crucial for new surveys. To meet this demand, we present the first method of predicting the age of transients in real-time from multi-wavelength time-series observations. We build a Bayesian probabilistic recurrent neural network. Our method can accurately predict the age of a transient with robust uncertainties as soon as it is initially triggered by a survey telescope. This work will be essential for the advancement of our understanding of the numerous young transients being detected by ongoing and upcoming astronomical surveys. |
Daniel Muthukrishna 🔗 |
-
|
Multiscale Feature Attribution for Outliers
(
Poster
)
>
Machine learning techniques can automatically identify outliers in massive datasets, much faster and more reproducibly than human inspection ever could. But finding such outliers immediately leads to the question: which features render this input anomalous? We propose a new feature attribution method, Inverse Multiscale Occlusion, that is specifically designed for outliers, for which we have little knowledge of the type of features we want to identify and expect that the model performance is questionable because anomalous test data likely exceed the limits of the training data. We demonstrate our method on outliers detected in galaxy spectra from the Dark Energy Spectroscopic Instrument and find its results to be much more interpretable than alternative attribution approaches. |
Jeff Shen · Peter Melchior 🔗 |
-
|
Learning Reionization History from Quasars with Simulation-Based Inference
(
Poster
)
>
Understanding the entire history of the ionization state of the intergalactic medium (IGM) is at the frontier of astrophysics and cosmology. A promising method to achieve this is by extracting the damping wing signal from the neutral IGM. As hundreds of redshift $z>6$ quasars are observed, we anticipate determining the detailed time evolution of the ionization fraction with unprecedented fidelity. However, traditional approaches to parameter inference are not sufficiently accurate. We assess the performance of a simulation-based inference (SBI) method to infer the neutral fraction of the universe from quasar spectra. The SBI adeptly exploits the shape information of the damping wing, enabling precise estimations of the neutral fraction $x_{\rm HI}$ and the wing position $w_p$. Importantly, the SBI framework successfully breaks the degeneracy between these two parameters, offering unbiased estimates of both. This makes the SBI superior to the traditional method using a pseudo-likelihood function. We anticipate that SBI will be essential to determine robustly the ionization history of the Universe through joint inference from the hundreds of high-$z$ spectra we will observe. |
Huanqing Chen · Joshua Speagle · Keir Rogers 🔗 |
-
|
Interpretable Joint Event-Particle Reconstruction at NOvA with Sparse CNNs and Transformers
(
Poster
)
>
The complex events observed at the NOvA long-baseline neutrino oscillation experiment contain vital information for understanding the most elusive particles in the standard model. The NOvA detectors observe interactions of neutrinos from the NuMI beam at Fermilab. Associating the particles produced in these interaction events to their source particles, a process known as reconstruction, is critical for accurately measuring key parameters of the standard model. Events may contain several particles, each producing sparse high-dimensional spatial observations, and current methods are limited to evaluating individual particles. To accurately label these numerous, high-dimensional observations, we present a novel neural network architecture that combines the spatial learning enabled by convolutions with the contextual learning enabled by attention. This joint approach, TransformerCVN, simultaneously classifies each event and reconstructs every individual particle's identity. TransformerCVN classifies events with 90% accuracy and improves the reconstruction of individual particles by 6% over baseline methods which lack the integrated architecture of TransformerCVN. In addition, this architecture enables us to perform several interpretability studies which provide insights into the network's predictions and show that TransformerCVN discovers several fundamental principles that stem from the standard model. |
Alexander Shmakov · Alejandro Yankelevich · Jianming Bian · Pierre Baldi 🔗 |
-
|
SimSIMS: Simulation-based Supernova Ia Model Selection with thousands of latent variables
(
Poster
)
>
We present principled Bayesian model selection through simulation-based neural classification applied to SN Ia analysis. We validate our approach on realistically simulated SN Ia lightcurve data, demonstrating its ability to recover posterior model probabilities while marginalizing over > 4000 latent variables. We briefly explore the dependence of Bayes factors on the true parameters of simulated data, demonstrating Occam's razor for nested models. When applied to a sample of 86 low-redshift SNe Ia from the CSP, our method prefers a model with a single dust law and no magnitude step with host mass, while disfavouring different dust laws for low- and high-mass hosts with odds in excess of 100:1. |
Konstantin Karchev · Roberto Trotta · Christoph Weniger 🔗 |
-
|
Accelerating Kinetic Simulations of Electrostatic Plasmas with Reduced-Order Modeling
(
Poster
)
>
Despite the advancements in high-performance computing and modern numerical algorithms, the cost remains prohibitive for multi-query kinetic plasma simulations. In this work, we develop data-driven reduced-order models (ROM) for collisionless electrostatic plasma dynamics, based on the kinetic Vlasov-Poisson equation. Our ROM approach projects the equation onto a linear subspace defined by principal proper orthogonal decomposition (POD) modes. We introduce an efficient tensorial method to update the nonlinear term using a precomputed third-order tensor. We capture multiscale behavior with a minimal number of POD modes by decomposing the solution into multiple time windows using a physical-time indicator and creating a temporally-local ROM. Applied to 1D-1V simulations, specifically the benchmark two-stream instability case, our time-windowed reduced-order model (TW-ROM) with the tensorial approach solves the equation approximately 450 times faster than Eulerian simulations while maintaining a maximum relative error of 3% for the training data and 12% for the testing data. |
Ping-Hsuan Tsai · Kevin Chung · Debojyoti Ghosh · John Loffeld · Youngsoo Choi · Jon Belof 🔗 |
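The projection onto POD modes described in the abstract above can be sketched with a singular value decomposition of a snapshot matrix. The snapshot data below are synthetic, the rank `r = 2` is chosen by hand, and no mean subtraction or time windowing is applied; this is only the generic POD step under those assumptions, not the authors' Vlasov-Poisson ROM or its tensorial update.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200)   # toy spatial grid
t = np.linspace(0.0, 1.0, 40)            # toy snapshot times

# Synthetic snapshot matrix (space x time): two spatial structures plus small noise.
snapshots = (np.outer(np.sin(x), np.cos(3 * t))
             + 0.5 * np.outer(np.sin(2 * x), t)
             + 1e-6 * rng.standard_normal((x.size, t.size)))

# POD modes are the leading left singular vectors of the snapshot matrix.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 2
basis = U[:, :r]

# Project onto the reduced subspace and lift back to the full space.
reduced = basis.T @ snapshots
reconstruction = basis @ reduced
rel_err = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
```

In a time-windowed variant, one would presumably build a separate `basis` from the snapshots belonging to each window, which keeps `r` small within every window.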
-
|
Sequential Monte Carlo for Detecting and Deblending Objects in Astronomical Images
(
Poster
)
>
Many of the objects imaged by the forthcoming generation of astronomical surveys will overlap visually. These objects are known as blends. Distinguishing and characterizing blended light sources is a challenging task, as there is inherent ambiguity in the type, position, and properties of each source. We propose SMC-Deblender, a novel approach to probabilistic astronomical cataloging based on sequential Monte Carlo (SMC). Given an image, SMC-Deblender evaluates catalogs with various source counts by partitioning the SMC particles into blocks. With this technique, we demonstrate that SMC can be a viable alternative to existing deblending methods based on Markov chain Monte Carlo and variational inference. In experiments with ambiguous synthetic images of crowded starfields, SMC-Deblender accurately detects and deblends sources, a task which proves infeasible for Source Extractor, a widely used non-probabilistic cataloging program. |
Tim White · Jeffrey Regier 🔗 |
-
|
DeepSurveySim: Simulation Software and Benchmark Challenges for Astronomical Observation Scheduling
(
Poster
)
>
Modern astronomical surveys have multiple competing scientific goals. Optimizing the observation schedule for these goals presents significant computational and theoretical challenges, and state-of-the-art methods rely on expensive human inspection of simulated telescope schedules. Automated methods, such as reinforcement learning, have recently been explored to accelerate scheduling. However, there do not yet exist benchmark data sets or user-friendly software frameworks for testing and comparing these methods. We present DeepSurveySim -- a high-fidelity and flexible simulation tool for use in telescope scheduling. DeepSurveySim provides methods for tracking and approximating sky conditions for a set of observations from a user-supplied telescope configuration. We envision this tool being used to produce benchmark data sets and for evaluating the efficacy of ground-based telescope scheduling algorithms. We introduce three example survey configurations and related code implementations as benchmark problems that can be simulated with DeepSurveySim. |
M Voetberg · Brian Nord 🔗 |
-
|
LoDIP: Low-dose phase retrieval with deep image prior
(
Poster
)
>
Phase retrieval under very low dose conditions is a challenging problem because phase retrieval algorithms become unstable in the presence of very high Poisson noise. To mitigate this problem, in situ coherent diffractive imaging (CDI) has been previously proposed, consisting of a static region of strong scatterers and a dynamic region of a sample. The static region is illuminated by a very high dose, while the dynamic region is irradiated by a very low dose, producing a coherent interference pattern from the two regions. Iterative phase retrieval algorithms are then used to reconstruct both regions from the diffraction patterns with high signal-to-noise ratio. Numerical simulations have indicated that in situ CDI can reduce radiation dose by one to two orders of magnitude over conventional CDI. Here we demonstrate low-dose phase retrieval with deep image prior, termed LoDIP, for in situ CDI. Using both numerical and experimental data, we demonstrate that LoDIP outperforms popular iterative phase retrieval algorithms under low-dose conditions. Our results show that LoDIP is not sensitive to the choice of the static structure nor to the geometric arrangement between the two objects. Additionally, unlike previous successful work with in situ CDI, LoDIP does not depend on multiple measurements with a common static region. We expect that the combination of deep-learning phase retrieval with in situ CDI will create numerous opportunities for high-resolution quantitative phase imaging for dose-sensitive materials, such as biological samples, polymers, organic semiconductors, and energy materials. |
Raunak Manekar · Elisa Negrini · Minh Pham · Daniel Jacobs · Jaideep Srivastava · Stanley Osher · Jianwei Miao 🔗 |
-
|
Bayesian multi-band fitting of alerts for kilonovae detection
(
Poster
)
>
In the era of multi-messenger astronomy, early classification of photometric alerts from wide-field and high-cadence surveys is a necessity to trigger spectroscopic follow-ups. These classifications are expected to play a key role in identifying potential candidates that might have a corresponding gravitational wave (GW) signature. Machine learning classifiers using features from parametric fitting of light curves are widely deployed by broker software to analyze millions of alerts, but most of these algorithms require as many points in the filter as the number of parameters to produce the fit, which increases the chances of missing a short transient. Moreover, the classifiers are not able to account for the uncertainty in the fits when producing the final score. In this context, we present a novel classification strategy that incorporates data-driven priors for extracting a joint posterior distribution of fit parameters and hence obtaining a distribution of classification scores. We train and test a classifier to identify kilonova events, which originate from binary neutron star mergers or neutron star black hole mergers, among simulations for the Zwicky Transient Facility observations with 19 other non-kilonova-type events. We demonstrate that our method is able to correctly estimate the uncertainty of misclassification, and the average classification score obtains an AUC score of 0.96 on simulated data. We further show that using this method we can process the entire alert stream in real-time and bring down the sample of probable events to a scale where they can be analyzed by domain experts. |
Biswajit Biswas 🔗 |
-
|
Forward Gradients for Data-Driven CFD Wall Models
(
Poster
)
>
Computational Fluid Dynamics (CFD) is used in the design and optimization of gas turbines and many other industrial/scientific applications. However, its practical use is often limited by the high computational cost, and the accurate resolution of near-wall flow is a significant contributor to this cost. Machine learning (ML) and other data-driven methods can complement existing wall models. Nevertheless, training these models is bottlenecked by the large computational effort and memory footprint demanded by back-propagation. Recent work has presented alternatives for computing gradients of neural networks where a separate forward and backward sweep is not needed and storage of intermediate results between sweeps is not required because an unbiased estimator for the gradient is computed in a single forward sweep. In this paper, we discuss the application of this approach for training a subgrid wall model that could potentially be used as a surrogate in wall-bounded flow CFD simulations to reduce the computational overhead while preserving predictive accuracy. |
Jan Hueckelheim · Tadbhagya Kumar · Krishnan Raghavan · Pinaki Pal 🔗 |
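The single-sweep gradient estimator referred to in the abstract above can be illustrated on a toy objective. The sketch assumes the common forward-gradient form, in which a random direction `v` drawn from a standard normal is scaled by the directional derivative of the objective along `v`. The objective, the hand-written directional derivative, and the sample count are illustrative assumptions; in practice a forward-mode automatic differentiation tool would supply the directional derivative without ever forming the full gradient.

```python
import numpy as np

def f_and_jvp(x, v):
    """Toy objective f(x) = sum(x^2) + sin(x[0]) and its directional derivative along v."""
    value = np.sum(x**2) + np.sin(x[0])
    directional = 2.0 * x @ v + np.cos(x[0]) * v[0]
    return value, directional

def forward_gradient(x, rng):
    """One sample of the estimator (grad f . v) v with v ~ N(0, I)."""
    v = rng.standard_normal(x.shape)
    _, d = f_and_jvp(x, v)
    return d * v

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])

# Averaging many single-sample estimates approaches the exact gradient.
estimate = np.mean([forward_gradient(x, rng) for _ in range(50000)], axis=0)
exact = 2.0 * x + np.array([np.cos(x[0]), 0.0, 0.0])
```

Each sample needs only one forward evaluation and no stored intermediates, which is the memory saving the abstract refers to. The cost is variance: a single sample is a noisy estimate, so a training loop would rely on many noisy steps rather than one exact gradient.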
-
|
Learning an Effective Evolution Equation for Particle-Mesh Simulations Across Cosmologies
(
Poster
)
>
Particle-mesh simulations trade small-scale accuracy for speed compared to traditional, computationally expensive N-body codes in cosmological simulations. In this work, we show how a learning-based model could be used to learn an effective evolution equation for the particles, by correcting the errors of the particle-mesh potential incurred on small scales during simulations. We find that our learnt correction yields evolution equations that generalize well to new, unseen initial conditions and cosmologies. We further demonstrate that the resulting corrected maps can be used in a simulation-based inference framework to yield an unbiased inference of cosmological parameters. The model, a network implemented in Fourier space, is exclusively trained on the particle positions and velocities. This work is of particular importance in the context where, in the coming decade, cosmology will be transformed by unprecedented volumes of survey data from multi-billion-dollar instruments, and extracting all the information from these datasets will require fast and accurate cosmological simulators which are not yet available. |
Nicolas Payot · Pablo Lemos · Laurence Perreault-Levasseur · Carolina Cuesta · Chirag Modi · Yashar Hezaveh 🔗 |
-
|
Active Learning for Discovering Complex Phase Diagrams with Gaussian Processes
(
Poster
)
>
We introduce a Bayesian active learning algorithm that efficiently elucidates phase diagrams. Using a novel acquisition function that assesses both the impact and likelihood of the next observation, the algorithm iteratively determines the most informative next experiment to conduct and rapidly discerns the phase diagrams with multiple phases. Comparative studies against existing methods highlight the superior efficiency of our approach. We demonstrate the algorithm's practical application through the successful identification of a skyrmion phase diagram. |
Max Zhu · Jian Yao · Marcus Mynatt · Hubert Pugzlys · Shuyi Li · Qingyuan Zhao · Chunjing Jia 🔗 |
-
|
RACER: Rational Artificial Intelligence Car-following-model Enhanced by Reality
(
Poster
)
>
This paper introduces RACER, the Rational Artificial Intelligence Car-following model Enhanced by Reality, a cutting-edge deep learning car-following model designed to predict Adaptive Cruise Control (ACC) driving behavior, which satisfies the partial derivative constraints necessary to maintain physical feasibility. Unlike conventional car-following models, RACER effectively integrates Rational Driving Constraints (RDC), crucial tenets of actual driving, resulting in strikingly accurate and realistic predictions. Notably, it adheres to the RDC, registering zero violations, in stark contrast to other models. This study incorporates physical constraints within AI models, especially for obeying rational behaviors in transportation. The versatility of the proposed model, including its potential to incorporate additional derivative constraints and broader architectural applications, enhances its appeal and broadens its impact within the scientific community. |
Tianyi Li · Raphael Stern 🔗 |
-
|
Learned integration contour deformation for signal-to-noise improvement in Monte Carlo calculations
(
Poster
)
>
Calculations of the strong nuclear interactions, encoded in the theory of Quantum Chromodynamics (QCD), are extraordinarily computationally demanding. In particular, the Monte Carlo integration used in lattice field theory calculations in this context suffers from severe signal-to-noise challenges. Complexifying the integration manifold with the complex contour deformation method reduces the variances of observables while guaranteeing the exactness of the results. In this work, we use convolutional neural networks to parametrize the deformed manifolds and demonstrate orders-of-magnitude reduction in the variance of a key observable (the Wilson loop) in a simplified model of QCD in three spacetime dimensions. |
William Detmold · Gurtej Kanwar · Yin Lin · Phiala Shanahan · Michael Wagman 🔗 |
-
|
The search for the lost attractor
(
Poster
)
>
N-body systems characterized by $r^{-2}$ attractive forces may display a self-similar collapse known as the gravo-thermal catastrophe. In star clusters, collapse is halted by binary stars, and a large fraction of Milky Way clusters may have already reached this phase. It has been speculated, with guidance from simulations, that macroscopic variables such as central density and velocity dispersion are governed post-collapse by an effective, low-dimensional system of ODEs. It is still hard to distinguish chaotic, low-dimensional motion from high-dimensional stochastic noise. Here we apply three machine learning tools to state-of-the-art dynamical simulations to constrain the post-collapse dynamics: topological data analysis (TDA) on a lag embedding of the relevant time series, Sparse Identification of Nonlinear Dynamics (SINDY), and Tests of Accuracy with Random Points (TARP).
|
Mario Pasquato · Syphax Haddad · Pierfrancesco Di Cintio · Alexandre Adam · Noé Dia · Mircea Petrache · Ugo Niccolò Di Carlo · Alessandro Alberto Trani · Laurence Perreault-Levasseur · Yashar Hezaveh · Pablo Lemos
|
-
|
Field Emulation and Parameter Inference with Diffusion Models
(
Poster
)
>
We use diffusion generative models to address two tasks of importance to cosmology -- as an emulator for cold dark matter density fields conditional on input cosmological parameters $\Omega_m$ and $\sigma_8$, and as a parameter inference model that can return constraints on the cosmological parameters of an input field. We show that the model is able to generate fields with power spectra that are consistent with those of the simulated target distribution, and capture the subtle effect of each parameter on modulations in the power spectrum. We additionally explore their utility as parameter inference models and find that we can obtain tight constraints on cosmological parameters.
|
Nayantara Mudur · Carolina Cuesta · Douglas P. Finkbeiner 🔗 |
-
|
Symbolic Machine Learning for High Energy Physics Calculations
(
Poster
)
>
The calculation of cross sections is of paramount importance in high-energy physics. Among other steps, this process involves squaring the particle interaction amplitudes, which can be very computationally expensive. These lengthy calculations are currently done using domain-specific symbolic algebra tools. We demonstrate that a transformer model, when trained on symbolic sequence pairs, can correctly predict the squared amplitudes of the Standard Model processes, namely QED, QCD and EW, with an accuracy of 98%, 97% and 95%, respectively, at a speed that is up to two orders of magnitude faster than current symbolic computation frameworks. We briefly note some limitations of the model and suggest possible future directions for this work. |
Abdulhakim Alnuqaydan · Sergei Gleyzer · Harrison Prosper · Eric Reinhardt · Francois Charton · Neeraj Anand 🔗 |
-
|
Autoencoding Labeled Interpolator, Inferring Parameters From Image And Image From Parameters
(
Poster
)
>
The Event Horizon Telescope (EHT) provides an avenue to study black hole accretion flows on event-horizon scales. Traditionally, fitting a semi-analytical model to EHT observations requires the construction of synthetic images, which is computationally expensive. This study presents an image-generating tool in the form of a generative machine learning model, which extends the capabilities of a variational autoencoder. This tool can rapidly and continuously interpolate between a training set of images and can retrieve the defining parameters of those images. Trained on a curated set of synthetic black hole images, our tool succeeds both at interpolating and generating images and at retrieving their physical parameters. By reducing the computational cost of generating an image, this tool facilitates parameter estimation and model validation for observations of black hole systems. |
Ali SaraerToosi · Avery Broderick 🔗 |
-
|
Leveraging Deep Learning for Physical Model Bias of Global Air Quality Estimates
(
Poster
)
>
Air pollution is the world’s largest environmental risk factor for human disease and premature death, resulting in more than 6 million premature deaths in 2019. Modeling one of the most important air pollutants, surface ozone (O3), remains a challenge, particularly at scales relevant for human health impacts; the drivers of global ozone trends at these scales are largely unknown, limiting the practical use of physics-based models. We employ a 2-D Convolutional Neural Network (CNN)-based U-Net architecture that estimates surface ozone MOMO-Chem model residuals, referred to as model bias. We demonstrate the potential of this technique in North America and Europe, highlighting its ability to better capture physical model residuals compared to a traditional machine learning method. We assess the impact of incorporating land use information from high-resolution satellite imagery to improve model estimates. Importantly, we discuss how our results can improve our scientific understanding of the factors impacting ozone bias at urban scales, which can be used to improve environmental policy. |
Kelsey Doerksen · Yarin Gal · Freddie Kalaitzis · Yuliya Marchetti · Steven Lu · James Montgomery · Kazuyuki Miyazaki · Kevin Bowman 🔗 |
-
|
Towards data-driven models of hadronization
(
Poster
)
>
This paper introduces two novel machine learning based approaches to improve hadron-level simulation by integrating experimental observables: Microscopic Alterations Generated from IR Collections (MAGIC), which fine-tunes normalizing flows, pre-trained on simulated data from PYTHIA, on experimental observables, and the Collective Reweighting Method (CRM), which reweights existing fragmentation functions to match experimental observables with a two-step procedure that makes use of an observable-level classifier and a hadron-level particle cloud-based regressor. Both methods show a promising direction towards data-driven models for hadronization. |
Christian Bierlich · Philip Ilten · Tony Menzo · Stephen Mrenna · Manuel Szewc · Michael K. Wilkinson · Ahmed Youssef · Jure Zupan 🔗 |
-
|
From Plateaus to Progress: Unveiling Training Dynamics of PINNs
(
Poster
)
>
Physics Informed Neural Networks (PINNs) promise performance gains in solving Partial Differential Equations related to diverse applications. Yet their training can be challenging, in part because of their unique loss function components. This study examines the optimization trajectory of PINNs for the heat equation, comparing it to a similarly-architected regression model. Our initial findings suggest that PINNs experience prolonged plateaus and unstable training behaviors predominantly due to misaligned update steps. This research sheds light on the underlying training dynamics, paving the way for improved PINN training methods. |
Daniel Lengyel · Panos Parpas · Rahil Pandya 🔗 |
-
|
Equivariant Neural Networks for Signatures of Dark Matter Morphology in Strong Lensing Data
(
Poster
)
>
One of the most promising avenues to study dark matter is through its interactions with gravity. In particular, it is well known that dark matter can be studied from the effect of its substructure in strong galaxy-galaxy lensing images. However, in practice, this is a very challenging problem, as the lensing signature is a sub-dominant effect relative to that from the main halo, and there are also many systematics which are hard to account for. Machine learning has been studied extensively in the context of lensing to circumvent exactly these problems. Indeed, deep learning methods have the potential to accurately identify images containing substructure. Most applications of machine learning to strong lensing rely on convolutional neural networks (CNNs). In this work, we study the performance of equivariant neural networks (ENNs) using simulated strong galaxy-galaxy lensing images as a means to study dark matter. We find that equivariant neural networks outperform state-of-the-art CNNs in both classification and regression tasks. This suggests that ENNs may be better suited for future lensing studies. |
Geo Jolly Cheeramvelil · Michael Toomey · Sergei Gleyzer 🔗 |
-
|
Echoes in the Noise: Posterior Samples of Faint Galaxy Surface Brightness Profiles with Score-Based Likelihoods and Priors
(
Poster
)
>
Examining the detailed structure of galaxy populations provides valuable insights into their formation and evolution mechanisms. Significant barriers to such analysis are the non-trivial noise properties of real astronomical images and the point spread function (PSF), which blurs structure. Here we present a framework which combines recent advances in score-based likelihood characterization and diffusion model priors to perform a true Bayesian analysis of image deconvolution. Our technique, when applied to minimally processed Hubble Space Telescope (\emph{HST}) data, recovers structures which have otherwise only become visible in next-generation James Webb Space Telescope (\emph{JWST}) imaging. |
Alexandre Adam · Connor Stone · Connor Bottrell · Ronan Legin · Laurence Perreault-Levasseur · Yashar Hezaveh 🔗 |
-
|
Deep Learning Segmentation of Spiral Arms and Bars
(
Poster
)
>
We present the first deep learning model for segmenting galactic spiral arms and bars. In a blinded assessment by expert astronomers, our predicted spiral arm masks are preferred over both current automated methods (99% of evaluations) and our original volunteer labels (79% of evaluations). Experts rated our spiral arm masks as |
Mike Walmsley · Ashley Spindler 🔗 |
-
|
Accelerating Flow Simulations using Online Dynamic Mode Decomposition
(
Poster
)
>
We develop an on-the-fly reduced-order model (ROM) integrated with a flow simulation, gradually replacing a corresponding full-order model (FOM) of a physics solver. Unlike offline methods requiring a separate FOM-only simulation prior to model reduction, our approach constructs a ROM dynamically during the simulation, replacing the FOM when deemed credible. Dynamic mode decomposition (DMD) is employed for online ROM construction, with a single snapshot vector used for rank-1 updates in each iteration. Demonstrated on a flow over a cylinder with Re=100, our hybrid FOM/ROM simulation is verified in terms of the Strouhal number, resulting in a 1.6-times speedup compared to the FOM solver. |
Seung Won Suh · Kevin Chung · Timo Bremer · Youngsoo Choi 🔗 |
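To make the idea of a rank-1 update per snapshot concrete, here is a minimal sketch of one common online DMD formulation, based on recursive least squares with the Sherman-Morrison identity. The abstract does not state the exact update the authors use, so this formulation, the class name `OnlineDMD`, and the synthetic random snapshot pairs are assumptions made purely for illustration.

```python
import numpy as np

class OnlineDMD:
    """Maintain the least-squares operator A (y ~ A x) under rank-1 updates."""

    def __init__(self, X0, Y0):
        # X0, Y0: (n, m) initial snapshot pairs; X0 must have full row rank.
        self.P = np.linalg.inv(X0 @ X0.T)   # inverse Gram matrix (symmetric)
        self.A = Y0 @ X0.T @ self.P         # batch least-squares start

    def update(self, x, y):
        # Sherman-Morrison rank-1 update for one new pair (x, y).
        Px = self.P @ x
        gamma = 1.0 / (1.0 + x @ Px)
        self.A += gamma * np.outer(y - self.A @ x, Px)
        self.P -= gamma * np.outer(Px, Px)

# Synthetic data (assumption): noisy pairs generated by a random linear map.
rng = np.random.default_rng(0)
n = 3
A_true = rng.normal(size=(n, n))
X = rng.normal(size=(n, 30))
Y = A_true @ X + 0.01 * rng.normal(size=(n, 30))

model = OnlineDMD(X[:, :10], Y[:, :10])
for k in range(10, 30):
    model.update(X[:, k], Y[:, k])

# Batch reference: least-squares fit over all 30 pairs at once.
A_batch = Y @ np.linalg.pinv(X)
```

After processing every pair, `model.A` should agree with `A_batch` up to round-off, which is the point of the rank-1 form: each new snapshot costs a few matrix-vector products instead of a full refit. In a flow solver, `x` and `y` would be consecutive state snapshots rather than random vectors.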
-
|
Sparse 3D Images: Point Cloud or Image methods?
(
Poster
)
>
Score-based generative models are a new class of generative models that have been shown to accurately generate high-dimensional datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex detector data. Point clouds, however, are likely a more natural representation for many of these datasets, particularly in calorimeters with high granularity that produce very sparse images. Point clouds preserve all of the information of the original simulation, deal more naturally with sparse datasets, and can be implemented with more compact models and datasets. In this work, two state-of-the-art score-based models are trained on the same set of calorimeter simulations and directly compared. |
Fernando Torales Acosta · Vinicius Mikuni · Benjamin Nachman · Miguel Arratia · Bishnu Karki · Ryan Milton · Piyush Karande · Aaron Angerami 🔗 |
-
|
Classification under Prior Probability Shift in Simulator-Based Inference: Application to Atmospheric Cosmic-Ray Showers
(
Poster
)
>
High-energy cosmic rays are informative probes of astrophysical sources in our galaxy. A main challenge is to separate gamma showers (extremely rare events of interest) from the vast majority of hadron showers, when we have access to realistic simulations of the shower production (forward process) but the prior distribution on the shower parameters is unknown. Direct classification of the showers using output data leads to biased predictions and invalid uncertainty estimates, since the prior is chosen by design and is different from the true distribution. We overcome these biases by proposing a new method that casts classification as a hypothesis testing problem under nuisance parameters. The main idea is to estimate ROC curves as a function of all nuisances, devising selection criteria that are valid under a generalized prior probability shift over both shower label and nuisance parameters. Our method yields a set-valued classifier that returns valid confidence sets for all levels alpha simultaneously without having to retrain the classifier for each level. |
Alexander Shen · Ann Lee · Luca Masserano · Tommaso Dorigo · Michele Doro · Rafael Izbicki 🔗 |
-
|
Rare Galaxy Classes Identified In Foundation Model Representations
(
Poster
)
>
We identify rare and visually distinctive galaxy populations by searching for structure within the learned representations of pretrained models. We show that these representations arrange galaxies by appearance in patterns beyond those needed to predict the pretraining labels. We design a clustering approach to isolate specific local patterns, revealing groups of galaxies with rare and scientifically interesting morphologies. |
Mike Walmsley · Anna Scaife 🔗 |
-
|
Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds
(
Poster
)
>
Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times. |
Justus C. Will · Andrea Jenney · Kara Lamb · Mike Pritchard · Colleen Kaul · Po-Lun Ma · Jacob Shpund · Kyle Pressel · Marcus van Lier-Walqui · Stephan Mandt 🔗 |
-
|
Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e
(
Poster
)
>
We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at addressing the challenge of maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab). Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an automated controller capable of providing real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale. We treat the Mu2e accelerator system as a Markov Decision Process suitable for Reinforcement Learning (RL), utilizing PPO to reduce bias and enhance training stability. A key innovation in our approach is the integration of a neuralized PID controller into the policy function, resulting in a significant improvement in the Spill Duty Factor (SDF) by 9.4\%, surpassing the performance of the current PID controller baseline by an additional 2.2\%. This paper presents the preliminary offline results based on a differentiable simulator of the Mu2e accelerator. It lays the groundwork for real-time implementations and applications, representing a crucial step towards automated proton beam intensity control for the Mu2e experiment. |
Jerry Yao-Chieh Hu · Chenwei Xu · Aakaash Narayanan · Mattson Thieme · Vladimir Nagaslaev · Mark Austin · Jeremy Arnold · Jose Berlioz · Pierrick Hanlet · Aisha Ibrahim · Dennis Nicklaus · Jovan Mitrevski · Gauri Pradhan · Andrea Saewert · Kiyomi Seiya · Brian Schupbach · Randy Thurman-Keup · Nhan Tran · Rui Shi · Seda Ogrenci · Alexis Maya-Isabelle Shuping · Kyle Hazelwood · Han Liu
|
-
|
Loss Functionals for Learning Likelihood Ratios
(
Poster
)
>
The likelihood ratio is a crucial quantity for statistical inference that enables hypothesis testing, construction of confidence intervals, reweighting of distributions, and more. For modern data- or simulation-driven scientific research, however, computing the likelihood ratio can be difficult or even impossible. Approximations of the likelihood ratio may be computed using parametrizations of neural network-based classifiers. By evaluating four losses in approximating the likelihood ratio of univariate Gaussians and simulated high-energy particle physics datasets, we recommend particular configurations for each loss and propose a strategy to scan over generalized loss families for the best overall performance. |
Shahzar Rizvi · Mariel Pettee · Benjamin Nachman 🔗 |
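The classifier-based approximation described above can be sketched on the univariate Gaussian case. This example assumes the binary cross-entropy loss, which is one standard choice and not necessarily one of the configurations the authors recommend; the means, sample size, learning rate, and the name `log_ratio` are illustrative assumptions. For a classifier $f$ trained with this loss on balanced samples, $f/(1-f)$ approximates $p_1/p_0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x0 = rng.normal(-0.5, 1.0, n)   # samples from p0 = N(-0.5, 1), label 0
x1 = rng.normal(+0.5, 1.0, n)   # samples from p1 = N(+0.5, 1), label 1
x = np.concatenate([x0, x1])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Logistic model f(x) = sigmoid(w * x + b), fit by full-batch gradient
# descent on the mean binary cross-entropy.
w, b = 0.0, 0.0
lr = 0.5
for _ in range(1000):
    f = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= lr * np.mean((f - y) * x)
    b -= lr * np.mean(f - y)

def log_ratio(x_new):
    # log[f / (1 - f)] of a sigmoid classifier is simply its logit.
    return w * np.asarray(x_new, dtype=float) + b
```

For these two Gaussians the exact log-likelihood ratio is $\log p_1(x)/p_0(x) = x$, so the fitted `w` should land close to 1 and `b` close to 0, up to statistical fluctuations from the finite sample.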
-
|
19 Parameters Is All You Need: Tiny Neural Networks for Particle Physics
(
Poster
)
>
As particle accelerators increase their collision rates, and machine learning solutions prove their reliability, the need for lightweight and fast neural network hardware implementations grows for low-latency tasks such as triggering. We examine the potential of one recent Lorentz- and permutation-symmetric architecture, PELICAN, and present its instances with as few as 19 trainable parameters that outperform generic architectures with tens of thousands of parameters when compared on the binary classification task of top quark jet tagging. |
Alexander Bogatskiy · Timothy Hoffman · Jan Offermann 🔗 |
-
|
CP-PINNs: Changepoints Detection in PDEs using Physics Informed Neural Networks with Total-Variation Penalty
(
Poster
)
>
The paper shows that Physics-Informed Neural Networks (PINNs) can fail to estimate the correct Partial Differential Equations (PDEs) dynamics in cases of unknown changepoints in the parameters. To address this, we propose a new CP-PINNs model which integrates PINNs with a Total-Variation penalty for accurate changepoints detection and PDEs discovery. In order to optimally combine the tasks of model fitting, PDEs discovery, and changepoints detection, we develop a new meta-learning algorithm that exploits batch learning to dynamically refine the optimization objective when moving over the consecutive batches of the data. Empirically, in the case of changepoints in the dynamics, our approach demonstrates accurate parameter estimation and model alignment, and in the case of no changepoints in the data, it converges numerically to the solution from the original PINNs model. |
Zhikang Dong · Pawel Polak 🔗 |
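As a rough illustration of why a total-variation penalty suits changepoint detection, the sketch below assumes the penalty takes its usual discrete form: the sum of absolute differences between consecutive values of a time-indexed PDE parameter. The exact penalty in CP-PINNs may differ, and the function name and example sequences are hypothetical.

```python
import numpy as np

def total_variation(params):
    """Sum of absolute differences between consecutive parameter values."""
    params = np.asarray(params, dtype=float)
    return float(np.sum(np.abs(np.diff(params, axis=0))))

# Hypothetical time-indexed parameter estimates.
constant = np.full(10, 2.0)                                   # no changepoint
one_jump = np.concatenate([np.full(5, 2.0), np.full(5, 3.5)]) # one changepoint
wiggly = 2.0 + 0.1 * (-1.0) ** np.arange(10)                  # small oscillation

tv_constant = total_variation(constant)
tv_one_jump = total_variation(one_jump)
tv_wiggly = total_variation(wiggly)
```

The penalty is zero for the constant sequence, equals the jump size 1.5 for the single changepoint, and is about 1.8 for the oscillating sequence even though its values never move more than 0.1 from the mean. A penalty of this form therefore favors piecewise-constant parameter estimates with a few sharp jumps over many small fluctuations.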
-
|
Self-Driving Telescopes: Autonomous Scheduling of Astronomical Observation Campaigns with Offline Reinforcement Learning
(
Poster
)
>
Modern astronomical experiments are designed to achieve multiple scientific goals -- from studies of galaxy evolution to cosmic acceleration. These goals require data of many different classes of night-sky objects, each of which has a particular set of observational needs. These observational needs are typically in strong competition with one another. This poses a challenging multi-objective optimization problem that remains unsolved. The effectiveness of Reinforcement Learning (RL) as a valuable paradigm for training autonomous systems has been well-demonstrated, and it may provide the basis for self-driving telescopes capable of optimizing the scheduling for astronomy campaigns. Simulated datasets containing examples of interactions between a telescope and a discrete set of sky locations on the celestial sphere can be used to train an RL model to sequentially gather data from these several locations to maximize a cumulative reward as a measure of the quality of the data gathered. We use simulated data to test and compare multiple implementations of a Deep Q-Network (DQN) for the task of optimizing the schedule of observations from the Stone Edge Observatory (SEO). We combine multiple improvements on the DQN and adjustments to the dataset, showing that DQNs can achieve an average reward of 87% ± 6% of the maximum achievable reward in each state on the test set. This is the first comparison of offline RL algorithms for a particular astronomical challenge and the first open-source framework for performing such a comparison and assessment task. |
Franco Terranova · M Voetberg · Brian Nord · Amanda Pagul 🔗 |
-
|
High-dimensional and Permutation Invariant Anomaly Detection with Diffusion Generative Models
(
Poster
)
>
Methods for anomaly detection of new physics processes are often limited to low-dimensional spaces due to the difficulty of learning high-dimensional probability densities. Particularly at the constituent level, incorporating desirable properties such as permutation invariance and variable-length inputs becomes difficult within popular density estimation methods. In this work, we introduce a permutation-invariant density estimator for particle physics data based on diffusion models, specifically designed to handle variable-length inputs. We demonstrate the efficacy of our methodology by utilizing the learned density as a permutation-invariant anomaly detection score, effectively identifying jets with low likelihood under the background-only hypothesis. To validate our density estimation method, we investigate the ratio of learned densities and compare to those obtained by a supervised classification algorithm. |
Vinicius Mikuni · Benjamin Nachman 🔗 |
-
|
Generative Diffusion Models for Lattice Field Theory
(
Poster
)
>
This study delves into the connection between machine learning and lattice field theory by linking generative diffusion models (DMs) with stochastic quantization (SQ), from a stochastic differential equation (SDE) perspective. We show that DMs can be conceptualized by reversing a stochastic process driven by the Langevin equation, which then produces samples from an initial distribution to approximate the target distribution. In a toy model, we highlight the capability of DMs to learn effective actions. Furthermore, we demonstrate their feasibility to act as a global sampler for generating configurations in the two-dimensional $\phi^4$ quantum lattice field theory.
|
Lingxiao Wang · Gert Aarts · Kai Zhou 🔗 |
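The Langevin equation that underlies stochastic quantization can be sketched for a zero-dimensional toy action. This is not a diffusion model and not the authors' setup: it only shows the forward Langevin process $d\phi = -S'(\phi)\,dt + \sqrt{2}\,dW$, whose stationary distribution is proportional to $e^{-S(\phi)}$. The action $S(\phi) = m^2\phi^2/2 + \lambda\phi^4/4$, the step size, and all function names are assumptions for illustration.

```python
import numpy as np

def grad_action(phi, m2=1.0, lam=0.0):
    # Derivative of the toy action S(phi) = m2 * phi**2 / 2 + lam * phi**4 / 4.
    return m2 * phi + lam * phi**3

def langevin_sample(n_chains, n_steps, eps, rng, **action_kwargs):
    """Euler-Maruyama integration of the Langevin equation, one value per chain."""
    phi = np.zeros(n_chains)  # all chains start far from equilibrium spread
    for _ in range(n_steps):
        noise = rng.normal(size=n_chains)
        drift = grad_action(phi, **action_kwargs)
        phi = phi - eps * drift + np.sqrt(2.0 * eps) * noise
    return phi

rng = np.random.default_rng(0)
# With lam = 0 the target exp(-S) is a unit-variance Gaussian.
samples = langevin_sample(n_chains=20000, n_steps=1000, eps=0.01, rng=rng)
```

For the free case ($\lambda = 0$) the sample variance should come out close to 1, with a small bias from the finite step size. Passing `lam=1.0` would sample a quartic toy action instead; a lattice field theory replaces the single variable per chain with a field configuration.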
-
|
Reconstruction of Fields from Sparse Sensing: Differentiable Sensor Placement Enhances Generalization
(
Poster
)
>
Recreating complex, high-dimensional global fields from limited data points is a grand challenge across various scientific and industrial domains. Given the prohibitive costs of specialized sensors and the frequent inaccessibility of certain regions of the domain, achieving full field coverage is typically not feasible. Therefore, the development of algorithms that intelligently improve sensor placement is of significant value. In this study, we introduce a general approach that employs differentiable programming to exploit sensor placement within the training of a neural network model in order to improve field reconstruction. We evaluated our method using two distinct datasets; the results show that our approach improved test scores. Ultimately, our method of differentiable placement strategies has the potential to significantly increase data collection efficiency, enable more thorough area coverage, and reduce redundancy in sensor deployment. |
Agnese Marcato · Daniel O'Malley · Hari Viswanathan · Eric Guiltinan · Javier E. Santos 🔗 |
-
|
Learning Dark Matter Representation from Strong Lensing Images through Self-Supervision
(
Poster
)
>
Gravitational lensing is one of the most important probes of dark matter and has recently seen a surge in applications of machine learning techniques. This is typically studied in the context of supervised learning, but given the upcoming influx of gravitational lensing data from Euclid and LSST, manual labeling for deep learning tasks has become an unsustainable approach. To address this challenge, self-supervised learning (SSL) emerges as a scalable solution. By leveraging unlabeled strong lensing data to learn feature representations, self-supervised models have the potential to enhance our understanding of dark matter via the effect of its substructure in strong lensing images. This work implements contrastive learning, Bootstrap Your Own Latent (BYOL), Simple Siamese (SimSiam), and self-distillation with no labels (DINO) using ResNet50 and Vision Transformer (ViT) networks, to acquire unsupervised embeddings for strong lensing images simulated for different dark matter models: ultra-light axions, cold dark matter, and halos without substructure. The learned representations of the encoder are fine-tuned using supervision and applied to classification and regression tasks, which are also benchmarked against a fully supervised ResNet50 baseline. Our results show that the self-supervised methods can consistently outperform their supervised counterparts. |
Yashwardhan Deshmukh · Kartik Sachdev · Michael Toomey · Sergei Gleyzer 🔗 |
-
|
Graph Neural Networks for Identifying Protein Reactive Compounds
(
Poster
)
>
In chemistry, electrophilic and nucleophilic reactions are utilized in the design of new protein reactive drugs, identification of toxic compounds, and the exclusion of reactive compounds from high throughput screening. In particular, covalent drugs comprise a class of protein reactive compounds that have attracted considerable interest due to their potential advantages such as better selectivity, longer effective dose, and overcoming drug resistances. Despite that, there are currently no reliable screening tools that go beyond basic substructure matching. In this work, we demonstrate that graph neural network models are capable of predicting covalent reactivity and capturing chemical motifs by looking at gradient activation heatmaps and how they correlate with chemical theory. We also propose a new dataset, ProteinReactiveDB, which was used to train graph-based models in this work. |
Victor Hugo Cano Gil · Christopher Rowley 🔗 |
-
|
Towards out-of-distribution generalization: robust networks learn similar representations
(
Poster
)
>
The generalization of machine learning (ML) models to out-of-distribution (OOD) examples remains a key challenge in extracting information from upcoming astronomical surveys. Interpretability approaches are a natural way to gain insights into the OOD generalization problem. Here we use Centered Kernel Alignment (CKA), a similarity metric for neural network representations, to examine the relationship between representation similarity and performance of pre-trained Convolutional Neural Networks (CNNs) on the CAMELS Multifield Dataset. We find that robust models, i.e., those that score high accuracy on both in-distribution (ID) and OOD data, learn similar representations, whereas non-robust models do not. We observe a strong correlation between similarity and accuracy in recovering cosmological parameters from three fields across the IllustrisTNG and SIMBA simulations. We discuss the potential application of representation similarity in guiding model design, training strategy, and mitigating the OOD problem by incorporating CKA as an inductive bias during training. |
Yash Gondhalekar · Sultan Hassan · Naomi Saphra · Sambatra Andrianomena 🔗 |
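For readers unfamiliar with CKA, the sketch below implements its linear variant on activation matrices of shape (examples, features). The abstract does not say whether a linear or kernel variant is used, so the linear form is an assumption, as are the function name and the random activations standing in for network layers.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices with the same number of rows."""
    X = X - X.mean(axis=0, keepdims=True)   # center each feature
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
acts_a = rng.normal(size=(200, 16))              # hypothetical layer activations
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))   # random orthogonal matrix
acts_rotated = 3.0 * acts_a @ Q                  # rotated and rescaled copy
acts_unrelated = rng.normal(size=(200, 8))       # independent representation

cka_same = linear_cka(acts_a, acts_a)
cka_rotated = linear_cka(acts_a, acts_rotated)
cka_unrelated = linear_cka(acts_a, acts_unrelated)
```

Linear CKA is invariant to orthogonal transformations and isotropic rescaling of the features, so `cka_rotated` equals 1 up to round-off, while the independent representation scores much lower. That invariance is what makes the measure usable for comparing layers of different networks whose features are not aligned.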
-
|
Towards an Astronomical Foundation Model for Stars
(
Poster
)
>
Rapid strides are currently being made in the field of artificial intelligence using Large Language Models (LLMs) with the Transformer architecture. Aside from some use of the base technical component of Transformers (the attention mechanism), the real potential of Transformers in creating artificial intelligence in astronomy has not yet been explored. Here, we introduce a novel perspective on such models in data-driven astronomy by proposing a framework for astronomical data that uses the same core techniques and architecture as natural-language LLMs. Using a variety of observations and labels of stars as an example, we build a prototype of a foundation model and we show that this model can be easily adapted and trained with cross-survey astronomical data sets. This single model has the ability to perform both discriminative and generative tasks even though the model was not trained to do any specific task that we test it on. This demonstrates that foundation models in astronomy are well within reach and will play a large role in the analysis of current and future large surveys. |
Henry Leung 🔗 |
-
|
Induced Generative Adversarial Particle Transformers
(
Poster
)
>
In high energy physics (HEP), machine learning methods have emerged as an effective way to accurately simulate particle collisions at the Large Hadron Collider (LHC). The message passing generative adversarial network (MPGAN) was the first model to simulate collisions as point, or |
Anni Li · Venkat Krishnamohan · Raghav Kansal · Javier Duarte · Rounak Sen · Steven Tsan · Zhaoyu Zhang 🔗 |
-
|
Lensformer: A Physics-Informed Vision Transformer for Gravitational Lensing
(
Poster
)
>
We introduce Lensformer, a state-of-the-art transformer architecture that incorporates the lens equation directly into the architecture for the purpose of studying dark matter in the context of strong gravitational lensing. This architecture combines the strengths of Transformer models from natural language processing with the analytical rigor of Physics-Informed Neural Networks (PINNs). By putting the lensing equation into the design of the architecture, Lensformer is able to accurately approximate the gravitational potential of the lensing galaxy. The physics-based features are then integrated into a Vision Transformer (ViT) neural network, which helps to provide a nuanced understanding when applied to various problems related to strong lensing. In this work we consider a toy example of classifying between simulations of different models of dark matter. To validate the model, we benchmark Lensformer against other leading architectures and demonstrate that it exhibits superior performance. |
Lucas José Velôso de Souza · Michael Toomey · Sergei Gleyzer 🔗 |
-
|
Self-supervised learning for searching jellyfish galaxies in the ocean of data from upcoming surveys
(
Poster
)
>
Human visual classification is the traditional approach to identifying jellyfish galaxies. However, this approach is unsuitable for large-scale galaxy surveys. In this study, we employ self-supervised learning on a dataset of approximately 200 images to extract semantically meaningful representations of galaxies. Despite the small dataset size, a similarity search suggests that the self-supervised representation space contains meaningful morphological information. We propose a framework for assigning JClass, a categorical disturbance measure, based on nearest-neighbor search in the self-supervised representation space to assist visual classifiers. Our pipeline is highly adaptable, allowing for the seamless identification of any rare astronomical signatures within astronomical datasets. |
Yash Gondhalekar · Rafael de Souza · Ana Chies Santos · Carolina Queiroz 🔗 |
-
|
deep-REMAP: Parameterization of Stellar Spectra Using Regularized Multi-Task Learning
(
Poster
)
>
Traditional spectral analysis methods are being pushed to their limits by exploding survey volumes. For efficient stellar characterization, accurate synthetic libraries and automated, interpretable techniques are needed. We develop a novel framework - deep-\underline{R}egularized \underline{E}nsemble-based \underline{M}ulti-task Learning with \underline{A}symmetric Loss for \underline{P}robabilistic Inference (deep-REMAP) - and show its effectiveness in predicting atmospheric parameters from observed spectra. We train our deep convolutional neural network on PHOENIX, fine-tune on MARVELS FGK dwarfs, then predict effective temperature, surface gravity, and metallicity for FGK giants. To incorporate MARVELS peculiarities, we augment PHOENIX with realistic signatures. When validated on MARVELS calibration stars, the fine-tuned model recovers parameters and uncertainties, demonstrating effective transfer learning. While trained on PHOENIX for MARVELS, deep-REMAP is easily extended to other libraries, wavelengths, resolutions, and wider stellar properties. |
Sankalp Gilda 🔗 |
-
|
Bayesian Simulation-based Inference for Cosmological Initial Conditions
(
Poster
)
>
Reconstructing astrophysical and cosmological fields from observations is challenging. It requires accounting for non-linear transformations, mixing of spatial structure, and noise. In contrast, forward simulators that map fields to observations are readily available for many applications. We present a versatile Bayesian field reconstruction algorithm rooted in simulation-based inference and enhanced by autoregressive modeling. The proposed technique is applicable to generic (non-differentiable) forward simulators and allows sampling from the posterior for the underlying field. We show promising first results on a proof-of-concept application: the recovery of cosmological initial conditions from late-time density fields. |
Noemi Anau Montel · Florian List · Christoph Weniger 🔗 |
-
|
Autoregressive Transformers for Disruption Prediction in Nuclear Fusion Plasmas
(
Poster
)
>
The physical sciences require models tailored to specific nuances of different dynamics. In this work, we study outcome predictions in nuclear fusion tokamaks, where a major challenge is disruptions, or the loss of plasma stability with damaging implications for the tokamak. Although disruptions are difficult to model using physical simulations, machine learning (ML) models have shown promise in predicting these phenomena. Here, we first study several variations on masked autoregressive transformers, achieving an average 5\% increase in the Area Under the Receiver Operating Characteristic metric above existing methods. We then compare transformer models to limited-context neural networks in order to shed light on the ``memory'' of plasma affected by tokamak controls. With these model comparisons, we argue for the persistence of a memory throughout the plasma in the context of tokamaks that our model exploits. |
Lucas Spangher · William Arnold · Alexander Spangher · Andrew Maris · Cristina Rea 🔗 |
-
|
CaloFFJORD: High Fidelity Calorimeter Simulation Using Continuous Normalizing Flows
(
Poster
)
>
High fidelity simulation of detector components in collider physics is computationally expensive and often not scalable to the requirements of future experimental facilities. In this work, we present a fast and accurate alternative for detector simulation based on continuous normalizing flows for calorimeter simulation named CaloFFJORD, able to reproduce high-fidelity calorimeter responses in a fraction of the time compared to full simulation routines. We evaluate our model using different detector simulations and show that CaloFFJORD can improve the fidelity of detector simulations by incorporating data symmetries that are harder to encode within standard normalizing flow architectures. |
Chirag Furia · Vinicius Mikuni 🔗 |
-
|
Machine learning-assisted nanoscale photoelectrical sensing
(
Poster
)
>
The ability to non-invasively measure local conductivity and permittivity at the nanoscale is of fundamental importance in unraveling the physics of quantum systems. One approach is Microwave Impedance Microscopy (MIM), a scanning probe technique operating at microwave frequencies. However, the resulting large datasets and vast parameter space make obtaining a mapping between MIM measurements and local microscopic properties challenging. Here, we overcome this challenge by using machine learning to reconstruct the local properties while incorporating physical priors. The synergy between MIM and ML allows for the quantitative predictions of complex interactions between excitons, charge carriers, and the dielectric environment. This approach provides profound insights into the fundamental physics of excitons in two-dimensional materials. |
Ziyan Zhu · Zhurun (Judy) Ji · Houssam Yassin · Zhi-Xun Shen · Thomas Devereaux 🔗 |
-
|
Emulating deviations from Einstein's General Relativity using conditional GANs
(
Poster
)
>
Computationally expensive simulations pose a severe bottleneck, especially in astronomy, where several realizations of the same physical processes are required to facilitate scientific studies, such as exploring new physics or constraining the underlying physics by comparing it with observations. Simulations that modify Einstein's gravity require solving highly non-linear equations and take $\sim$10 times more time than the normal ones. In order to mitigate this bottleneck, we use a conditional generative adversarial network (cGAN) to map output fields from normal simulations to output fields of time-consuming simulations. Our model uses a frequency-based loss during training and uses indirect emulation wherein the mapping is achieved using ratio fields instead of the traditional input $\rightarrow$ output domain translation. Our cGAN agrees well with the ground-truth images despite the visually minor differences between fields from the input and output domains.
|
Yash Gondhalekar · Sownak Bose 🔗 |
-
|
Operator SVD with Neural Networks via Nested Low-Rank Approximation
(
Poster
)
>
This paper proposes an optimization-based method to learn the singular value decomposition (SVD) of a compact operator with ordered singular functions. The proposed objective function is based on Schmidt's low-rank approximation theorem (1907) that characterizes a truncated SVD as a solution minimizing the mean squared error, accompanied by a technique called nesting to learn the ordered structure. When the optimization space is parameterized by neural networks, we refer to the proposed method as NeuralSVD. The implementation does not require sophisticated optimization tricks, unlike existing approaches. |
Jongha (Jon) Ryu · Xiangxiang Xu · Hasan Sabri Melihcan Erol · Yuheng Bu · Lizhong Zheng · Gregory Wornell 🔗 |
-
|
Gradient weighted physics-informed neural networks for capturing shocks in porous media flows
(
Poster
)
>
Physics-informed neural networks (PINNs) seamlessly integrate physical laws into machine learning models, enabling accurate simulations while preserving the underlying physics. However, PINNs are still suboptimal in approximating discontinuities in the form of shocks compared to the traditional numerical shock-capturing methods. This paper proposes a framework to approximate shocks arising in dynamic porous media flows by weighting the governing nonlinear partial differential equation (PDE) with a physical gradient-based term in the loss function. The applicability of the proposed framework is investigated on the forward problem of immiscible two-phase fluid transport in porous media governed by a nonlinear first-order hyperbolic Buckley–Leverett PDE. Particularly, convex and non-convex flux functions are studied involving shocks and rarefaction. The results demonstrate that the proposed framework consistently learns accurate approximations containing shocks and rarefaction by weighting the underlying PDE with a physical gradient term and outperforms state-of-the-art artificial viscosity-based neural network methods to capture shocks on the standard L2-norm metric. |
Somiya Kapoor · Abhishek Chandra · Taniya Kapoor · Mitrofan Curti 🔗 |
-
|
Physics-Informed Calibration of Aeromagnetic Compensation in Magnetic Navigation Systems using Liquid Time-Constant Networks
(
Poster
)
>
Magnetic navigation is a rising GPS alternative that has proven useful for airborne magnetic navigation (MagNav). External magnetic fields combine Earth’s crustal anomaly field and disruptions induced by aircraft electronics and Earth’s large-scale magnetic fields. We introduce an approach using Liquid Time-Constant Networks (LTCs) to minimize noise observed in airborne MagNav. LTCs can effectively model and remove the aircraft’s magnetic interference, improving the detection of weak anomalies. Using real flight data, we compare our system to traditional models and observe up to a 64% reduction in aeromagnetic compensation error (RMSE nT). This indicates our proposed approach can significantly improve the efficiency and reliability of airborne MagNav. |
Favour Nerrise · Andrew Sosanya · Patrick Neary 🔗 |
-
|
The DL Advocate: Playing the devil’s advocate with hidden systematic uncertainties
(
Poster
)
>
We propose a new method based on machine learning to play the devil's advocate and investigate the impact of unknown systematic effects in a quantitative way. This method proceeds by reversing the measurement process and using the physics results to interpret systematic effects under the Standard Model hypothesis. We explore this idea through a combination of gradient descent and optimisation techniques. Its application and potential are illustrated with an example that studies the branching fraction measurement of a heavy-flavour decay. We find that the size of a hypothetical hidden systematic uncertainty strongly depends on the kinematic overlap between the signal and normalisation channel. |
Andrey Ustyuzhanin · Andrey Golutvin · Aleksandr Iniukhin · Patrick Owen · Andrea Mauri · Nicola Serra 🔗 |
-
|
MCMC to address model misspecification in Deep Learning classification of Radio Galaxies
(
Poster
)
>
The radio astronomy community is adopting deep learning techniques to deal with the huge data volumes expected from the next generation of radio observatories. Bayesian neural networks (BNNs) provide a principled way to model uncertainty in the predictions made by deep learning models and will play an important role in extracting well-calibrated uncertainty estimates from the outputs of these models. However, most commonly used approximate Bayesian inference techniques such as variational inference and MCMC-based algorithms experience a "cold posterior effect (CPE)", according to which the posterior must be down-weighted in order to get good predictive performance. The CPE has been linked to several factors, such as data augmentation or dataset curation leading to a misspecified likelihood, and prior misspecification. In this work we use MCMC sampling to show that a Gaussian parametric family is a poor variational approximation to the true posterior and gives rise to the CPE previously observed in morphological classification of radio galaxies using variational inference based BNNs. |
Devina Mohan · Anna Scaife 🔗 |
-
|
Application of Zone Method based Physics-Informed Neural Networks in Reheating Furnaces
(
Poster
)
>
Foundation Industries (FIs) comprise glass, metals, cement, ceramics, bulk chemicals, paper, steel, etc., and provide crucial foundational materials for a diverse set of economically relevant industries: automobiles, machinery, construction, household appliances, chemicals, etc. Reheating furnaces within the manufacturing chain of FIs are energy-intensive. Deep Learning (DL) powered control systems could lead to a notable reduction in energy consumption by reducing the overall heating time in furnaces. This could help achieve the Net-Zero goals in FIs for sustainable manufacturing. In this work, due to the infeasibility of obtaining good-quality data in scenarios like reheating furnaces, a computational model based on the classical Hottel's zone method has been used to generate data for DL-based model training via regression. To further enhance the Out-Of-Distribution (OOD) generalization capability of the trained DL model, we propose a Physics-Informed Neural Network (PINN) that incorporates prior physical knowledge using a set of novel Energy-Balance regularizers. |
Ujjal Dutta · Aldo Lipani · Chuan Wang · Yukun Hu 🔗 |
-
|
LEO Satellite Orbit Prediction with Physics Informed Machine Learning
(
Poster
)
>
In recent space missions, the more complicated the missions become, the more important autonomous spacecraft controllers are. In this study, we focus on precise autonomous orbit prediction of LEO (Low-Earth Orbit) satellites. Physics Informed Machine Learning (PIML) enables us to predict the satellite orbit with compatible accuracy to the numerical simulation but more efficiently. The proposed physics informed machine learning algorithm is based on modelling the orbital trajectory as a Partial Differential Equation (PDE) and using a deep Neural Operator (NO) to model the dynamics of the system. We also analyse the limitations in terms of prediction errors with respect to noisy measurements, sample frequency and computational requirements for satellite onboard operation. |
Francesco Alesiani · Makoto Takamoto · Toshio Kamiya · Daisuke Etou 🔗 |
-
|
Physically Accurate Fast Nanophotonic Simulations with Physics Informed Model and Training
(
Poster
)
>
Photonic inverse design has emerged as a powerful technique for creating non-intuitive optical devices, revolutionizing traditional design methodologies. However, the bottleneck in photonic inverse design lies in the computational cost of physically accurate 3D simulations, as a high number of optimization iterations or a large design footprint becomes a limiting factor in photonic device design. Here we introduce a novel approach, utilizing a two-stage model combined with a 2D-FDFD simulator, to achieve accurate field predictions in much less time than 3D FDTD. The model is trained to emphasize field properties crucial for photonic device optimization, such as mode overlap and transmission, enabling rapid and physically accurate simulations. Results demonstrate a speedup of up to 264.53 times compared to 3D FDTD simulations, opening new avenues for the design of complex and accurate photonic devices. |
Ahmet Onur Dasdemir · Can Dimici · Aykut Erdem · Emir Salih Magden 🔗 |
-
|
Bayesian Imaging for Radio Interferometry with Score-Based Priors
(
Poster
)
>
The inverse imaging task in radio interferometry is a key limiting factor to retrieving Bayesian uncertainties in radio astronomy in a computationally effective manner. We use a score-based prior derived from optical images of galaxies to recover images of protoplanetary disks from the DSHARP survey. We demonstrate that our method produces accurate posterior samples despite the misspecified galaxy prior. We show that our approach produces results which are competitive with existing radio interferometry imaging algorithms. |
Noé Dia · M. J. Yantovski-Barth · Alexandre Adam · Micah Bowles · Pablo Lemos · Laurence Perreault-Levasseur · Yashar Hezaveh · Anna Scaife 🔗 |
-
|
Virtual EVE: a Deep Learning Model for Solar Irradiance Prediction
(
Poster
)
>
Understanding space weather is vital for the protection of our terrestrial and space infrastructure. In order to predict space weather accurately, large amounts of data are required, particularly in the extreme ultraviolet (EUV) spectrum. An exquisite source of such data is the Solar Dynamics Observatory (SDO), which has been gathering solar measurements for the past 13 years. However, after a malfunction in 2014 affecting the onboard Multiple EUV Grating Spectrograph A (MEGS-A) instrument, the scientific output in terms of EUV measurements has been significantly degraded. Building upon existing research, we propose to utilize deep learning for the virtualization of the defective instrument. Our architecture features a linear component and a convolutional neural network (CNN) with EfficientNet as a backbone. The architecture takes as input grayscale images of the Sun at multiple frequencies, provided by the Atmospheric Imaging Assembly (AIA), as well as solar magnetograms produced by the Helioseismic and Magnetic Imager (HMI). Our findings highlight how AIA data are all that is needed for accurate predictions of solar irradiance. Additionally, our model constitutes an improvement with respect to the state-of-the-art in the field, further promoting the idea of deep learning as a viable option for the virtualization of scientific instruments. |
Manuel Indaco · Daniel Gass · William Fawcett · Richard Galvez · Paul Wright · Andres Munoz-Jaramillo 🔗 |
-
|
High-Cadence Thermospheric Density Estimation enabled by Machine Learning on Solar Imagery
(
Poster
)
>
Accurate estimation of thermospheric density is critical for precise modeling of satellite drag forces in low Earth orbit (LEO). Improving this estimation is crucial to tasks such as state estimation, collision avoidance, and re-entry calculations. The largest source of uncertainty in determining thermospheric density is modeling the effects of space weather driven by solar activity. Current operational models rely on ground-based proxy indices which imperfectly correlate with the complexity of solar outputs. In this work, we directly incorporate NASA’s Solar Dynamics Observatory (SDO) extreme ultraviolet (EUV) spectral images into a neural thermospheric density model. We demonstrate that direct EUV imagery can replace proxies and enable predictions with much higher temporal resolution while significantly outperforming current operational models. Our method paves the way for assimilating direct EUV measurements into operational use for safer LEO satellite navigation. |
Shreshth Malik · James Edward Joseph Walsh · Giacomo Acciarini · Thomas Berger · Atilim Gunes Baydin 🔗 |
-
|
Combining astrophysical datasets with CRUMB
(
Poster
)
>
At present, the field of astronomical machine learning lacks widely-used benchmarking datasets; most research employs custom-made datasets which are often not publicly released, making comparisons between models difficult. In this paper we present CRUMB, a publicly-available image dataset of Fanaroff-Riley galaxies constructed from four |
Fiona Porter 🔗 |
-
|
Ultra Fast Transformers on FPGAs for Particle Physics Experiments
(
Poster
)
>
This work introduces a highly efficient implementation of the transformer architecture on a Field-Programmable Gate Array (FPGA) by using the hls4ml tool. Given the demonstrated effectiveness of transformer models in addressing a wide range of problems, their application in experimental triggers within particle physics becomes a subject of significant interest. In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers. To evaluate the effectiveness of our implementation, we have focused on a particle physics jet flavor tagging problem, employing a public dataset. We recorded latency under 2 microseconds on the Xilinx UltraScale+ FPGA, which is compatible with hardware trigger requirements at the CERN Large Hadron Collider experiments. |
Zhixing Jiang · Ziang Yin · Elham E Khoda · Vladimir Loncar · Ekaterina Govorkova · Eric Moreno · Philip Harris · Scott Hauck · Shih-chieh Hsu 🔗 |
-
|
Unleashing the Potential of Fractional Calculus in Graph Neural Networks
(
Poster
)
>
We introduce the FRactional-Order graph Neural Dynamical network (FROND), a learning framework that augments traditional graph neural ordinary differential equation (ODE) models by integrating the time-fractional Caputo derivative. Thanks to its non-local characteristic, fractional calculus enables our framework to encapsulate long-term memories during the feature-updating process, diverging from the Markovian updates inherent in conventional graph neural ODE models. This capability enhances graph representation learning. Analytically, we show that over-smoothing issues are mitigated when feature updating is regulated by a diffusion process. Additionally, our framework affords a fresh dynamical system perspective to comprehend various skip or dense connections situated between GNN layers in existing literature. |
Qiyu Kang · Kai Zhao · Qinxu Ding · Feng Ji · Xuhao Li · WENFEI LIANG · Yang Song · Wee Peng Tay 🔗 |
-
|
Approximately-invariant neural networks for quantum many-body physics
(
Poster
)
>
We propose approximately group-invariant neural networks for quantum many-body physics problems. These tailor-made architectures are parameter-efficient, scalable, significantly outperform existing symmetry-unaware neural network architectures, and are competitive with the state-of-the-art iPEPS methods, as we demonstrate on a perturbed toric code toy model on a $10 \times 10$ lattice. This paves the way towards studying traditionally challenging quantum spin liquid problems within interpretable neural network architectures.
|
Dominik Kufel · Jack Kemp · Norman Yao 🔗 |
-
|
Pseudotime Diffusion
(
Poster
)
>
Analysis of whole-genome sequencing data has been greatly outpaced by the experimental techniques that generate those datasets. The computational challenges associated with these analyses typically make machine learning methods more suitable than more conventional methods like dimensionality reduction, which limits the information obtainable from a dataset. In this paper, we focus on the biophysical model of RNA velocity, which yields meaningful insights into the functional trajectories of individual cells. There are many downstream applications, such as the identification of key genes driving a disease pathway. We improve the dynamical model by relaxing unrealistic assumptions and using the resulting generative process to train a diffusion model to compute pseudotime. Our probabilistic model is able to quantify the uncertainty in its pseudotime predictions. Finally, we demonstrate the efficacy of our model on a series of benchmark tasks. |
Jacob Moss · Jeremy England · Pietro Lió 🔗 |
-
|
Reinforcement Learning for Ising Model
(
Poster
)
>
The Ising spin glass model, a fundamental concept in statistical mechanics and condensed matter physics, provides insights into the behavior of interacting spins within a physical system, and can also be used to formulate many combinatorial optimization problems. However, solving the Ising Model for large and complex systems is computationally demanding and often infeasible using traditional methods. In this paper, we present a deterministic REINFORCE algorithm tailored for the Ising Model, enabling state-of-the-art performance through learned state transition policies. First, we formulate the Ising Model with the MaxCut problem as a case study. Second, we propose a novel deterministic REINFORCE algorithm incorporating the Local Search approach. Finally, we evaluate our algorithm on well-known datasets and demonstrate state-of-the-art performance. |
Yichen Lu · Xiao-Yang Liu 🔗 |
-
|
Reconstructing Free Energy Using Bayesian Thermodynamic Integration
(
Poster
)
>
We introduce a new approach for the reconstruction of thermodynamic functions and phase boundaries in two-parameter statistical mechanics systems, without a requirement for knowledge of the system's energy. Our method is based on expressing the Fisher metric in terms of the posterior distributions over a space of external parameters and approximating the metric field by a Hessian of a convex function. We use the proposed approach to reconstruct the partition functions and phase diagrams of the Ising model and the exactly solvable non-equilibrium TASEP without any a priori knowledge about microscopic rules of the models, except for the selection of the range of external parameters. |
Alexander Lobashev · Mikhail Tamm 🔗 |
-
|
ELUQuant: Event-Level Uncertainty Quantification using Physics-Informed Bayesian Neural Networks with Flow approximated Posteriors - A DIS Study
(
Poster
)
>
We present a Bayesian deep learning architecture with multiplicative normalizing flows for precise uncertainty quantification (UQ) at the physics event level. This method distinguishes both types of uncertainties, aleatoric and epistemic, offering nuanced insights. When applied to Deep Inelastic Scattering (DIS) events, the model extracts kinematic variables effectively, paralleling the efficacy of recent techniques, but with event-level UQ. This UQ proves essential for tasks like event filtering and can rectify errors without the ground truth. Tests using the H1 detector at HERA suggest potential applications for the future EIC, including data monitoring and anomaly detection. Notably, our model efficiently handles large samples with low inference time. |
Cristiano Fanelli · James Giroux 🔗 |