
Workshop: NeurIPS 2023 Workshop: Machine Learning and the Physical Sciences

GAMMA: Galactic Attributes of Mass, Metallicity, and Age Dataset

Tobias Buck · Ufuk Çakır


We introduce the GAMMA (Galactic Attributes of Mass, Metallicity, and Age) dataset, a comprehensive collection of galaxy data tailored for machine learning (ML) applications. The dataset offers detailed 2D maps and 3D cubes of 11 727 galaxies, capturing three essential attributes: stellar age, metallicity, and mass. Together with the dataset, we publish our code for extracting any other stellar or gaseous property from the raw simulation suite, so that the dataset can be extended beyond these initial properties and adapted to a variety of computational tasks. Well suited to feature extraction, clustering, and regression tasks, GAMMA offers a unique lens for exploring galactic structures through computational methods and serves as a bridge between astrophysical simulations and the field of scientific ML. As a first benchmark, we apply Principal Component Analysis (PCA) to this dataset. We find that PCA effectively captures the key morphological features of galaxies with a small number of components. We achieve a dimensionality reduction by a factor of ~200 (~3 650) for 2D images (3D cubes) with a reconstruction error below 5%.
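
To make the PCA benchmark concrete, the following is a minimal sketch of how a compression factor and a reconstruction error could be computed for flattened galaxy maps. It is not the authors' code: the data are a synthetic low-rank stand-in rather than GAMMA maps, the function names (`pca_fit`, `pca_reconstruct`, `relative_error`) are hypothetical, and the error metric (relative Frobenius norm) is an assumption, since the abstract does not define how reconstruction quality is measured.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA via SVD. X has shape (n_samples, n_features)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def pca_reconstruct(X, mean, components):
    """Project X onto the retained components and map back to pixel space."""
    scores = (X - mean) @ components.T
    return scores @ components + mean


def relative_error(X, X_hat):
    """Relative Frobenius-norm error (an assumed metric, see lead-in)."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


# Synthetic stand-in for flattened 2D maps (NOT the GAMMA data):
# a few latent factors mixed into pixel space, plus weak noise.
rng = np.random.default_rng(0)
n_galaxies, side, n_components = 200, 16, 8
latent = rng.normal(size=(n_galaxies, n_components))
basis = rng.normal(size=(n_components, side * side))
images = latent @ basis + 0.01 * rng.normal(size=(n_galaxies, side * side))

mean, components = pca_fit(images, n_components)
reconstruction = pca_reconstruct(images, mean, components)

# Compression factor: pixels per image divided by retained components.
compression = images.shape[1] / n_components
error = relative_error(images, reconstruction)
print(f"compression factor: {compression:.0f}, relative error: {error:.4f}")
```

Under this definition, the compression factor counts only the per-galaxy scores; the mean image and the component vectors are shared across the whole sample and stored once.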
