Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)

Data Distillation for Neural Network Potentials toward Foundational Dataset

Gang Seob Jung · Sangkeun Lee · Jong Choi

Keywords: [ Enhanced Sampling ] [ Neural Network Potential ] [ Data Distillation ] [ Active Learning ] [ Knowledge Transfer ]


Abstract:

Machine learning (ML) techniques and atomistic modeling have rapidly transformed materials design and discovery. In particular, generative models can swiftly propose promising materials for targeted applications. However, the properties predicted by generative models often do not match those calculated through ab initio methods. This discrepancy can arise because the generated coordinates are not fully relaxed, whereas many properties are derived from relaxed structures. Neural network-based potentials (NNPs) can expedite the process by relaxing the initially generated structures. Nevertheless, acquiring data to train NNPs for this purpose can be extremely challenging, as the data must encompass previously unknown structures. This study utilized extended ensemble molecular dynamics (MD) to secure a broad range of liquid- and solid-phase configurations in metallic systems. We then significantly reduced the dataset through active learning without losing much accuracy. We found that the NNP trained on the distilled data could predict different energy-minimized close-packed crystal structures even though those structures were not explicitly part of the initial data. Furthermore, the data can be transferred to another metallic system without repeating the sampling and distillation processes. Our approach to data acquisition and distillation has demonstrated the potential to expedite NNP development and enhance materials design and discovery by integrating generative models.
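To make the distillation step concrete, below is a minimal sketch of active-learning data reduction via query-by-committee: a committee of cheap surrogate models is trained on the currently selected configurations, and only the pool configurations where the committee disagrees most are added to the distilled set. All names, array shapes, thresholds, and the ridge-regression surrogate are hypothetical stand-ins for the paper's NNP ensemble and extended-ensemble MD data; this is not the authors' implementation.

```python
# Minimal sketch of active-learning data distillation via query-by-committee.
# Assumptions (not from the paper): random descriptors, a linear ridge
# surrogate, and all sizes/thresholds are placeholders for the actual NNP
# ensemble and MD-sampled metallic configurations.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the large pool of extended-ensemble MD configurations:
# each row is a per-configuration descriptor, y is its reference energy.
n_pool, n_feat = 5000, 32
X_pool = rng.normal(size=(n_pool, n_feat))
true_w = rng.normal(size=n_feat)
y_pool = X_pool @ true_w + 0.05 * rng.normal(size=n_pool)

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression used here as a cheap NNP surrogate."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def committee_predict(models, X):
    """Stack energy predictions from each committee member."""
    return np.stack([X @ w for w in models], axis=0)

# Start from a small random seed set; grow it only where the committee disagrees.
selected = list(rng.choice(n_pool, size=50, replace=False))
n_members, n_rounds, batch = 5, 10, 50

for _ in range(n_rounds):
    # Train each member on a bootstrap resample of the current distilled set.
    models = []
    for _ in range(n_members):
        idx = rng.choice(selected, size=len(selected), replace=True)
        models.append(fit_ridge(X_pool[idx], y_pool[idx]))

    # Committee disagreement (std of predicted energies) as the uncertainty signal.
    preds = committee_predict(models, X_pool)
    uncertainty = preds.std(axis=0)
    uncertainty[selected] = -np.inf  # never re-select what we already have

    # Keep only the most informative configurations; the rest are "distilled away".
    new = np.argsort(uncertainty)[-batch:]
    selected.extend(new.tolist())

print(f"Distilled set: {len(selected)} of {n_pool} configurations")
```

In an actual NNP workflow, the committee members would be independently initialized network potentials and the uncertainty signal would typically be the spread of predicted energies or forces, with the selected configurations relabeled by ab initio calculations before retraining.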
