

Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)

Tree-based Quantile Active Learning for automated discovery of MOFs

Ashna Jose · Emilie Devijver · Roberta Poloni · Valérie Monbet · Noel Jakse

Keywords: [ Query-based Learning ] [ Quantile Active Learning ] [ Metal Organic Frameworks ] [ Automated materials discovery ]


Abstract: Metal-organic frameworks (MOFs), formed through coordination bonds between metal ions and organic ligands, are promising materials for efficient gas adsorption due to their ultrahigh porosity, chemical tunability and large surface area. Because over a hundred thousand hypothetical MOFs have been reported to date, brute-force discovery of the best-performing MOF for a specific application is not feasible. Recently, predicting material properties with machine learning algorithms has played a crucial role in scanning large databases, but this often requires large labeled training sets, which are not always available. To address this, active learning is necessary: the training set is constructed iteratively by querying only informative labels. Moreover, in most cases, only a very specific range of the property of interest is desirable. We employ a novel regression tree-based quantile active learning algorithm that uses the partitions of a regression tree to select new samples to be added to the training set. It thereby limits the sample size while maximizing the prediction quality over a quantile of interest. Tests on benchmark MOF data sets demonstrate that focusing on a specific quantile is effective for learning regression models that predict electronic band gaps and CO$_2$ adsorption in the regions of interest from a very limited labeled data set.
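
The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the tree construction, the per-leaf sampling rule, or the quantile definition, so everything below is an assumption made for illustration. The names `fit_tree`, `predict`, and `quantile_active_learning` are hypothetical. The sketch assumes the region of interest is the upper `q`-quantile of the labels seen so far, and it samples uniformly among unlabeled points whose tree leaf predicts a value in that range.

```python
import random
from statistics import mean, pvariance


def fit_tree(X, y, depth=3, min_leaf=3):
    """Fit a small regression tree by greedy squared-error splits (illustrative only)."""
    best = None
    if depth > 0 and len(y) >= 2 * min_leaf:
        for j in range(len(X[0])):
            for t in sorted({x[j] for x in X}):
                left = [i for i, x in enumerate(X) if x[j] <= t]
                right = [i for i, x in enumerate(X) if x[j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                # Sum of squared errors around each side's mean label.
                sse = sum(len(s) * pvariance([y[i] for i in s]) for s in (left, right))
                if best is None or sse < best[0]:
                    best = (sse, j, t, left, right)
    if best is None:
        return {"value": mean(y)}  # leaf: predict the mean label
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1, min_leaf),
        "right": fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1, min_leaf),
    }


def predict(tree, x):
    """Walk down to the leaf containing x and return its mean label."""
    while "value" not in tree:
        side = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["value"]


def quantile_active_learning(pool, oracle, q=0.8, n_init=10, n_rounds=5, batch=5, seed=0):
    """Return {pool index: label} for the points queried from the oracle.

    Assumes n_init >= 1 and a non-empty pool.
    """
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labels = {}
    for _ in range(min(n_init, len(unlabeled))):  # random initial training set
        i = unlabeled.pop()
        labels[i] = oracle(pool[i])
    for _ in range(n_rounds):
        idx = list(labels)
        y = [labels[i] for i in idx]
        tree = fit_tree([pool[i] for i in idx], y)
        cutoff = sorted(y)[int(q * (len(y) - 1))]  # empirical q-quantile of known labels
        # Keep unlabeled points whose leaf predicts a value in the target range.
        candidates = [i for i in unlabeled if predict(tree, pool[i]) >= cutoff]
        if not candidates:
            candidates = list(unlabeled)  # fallback: no leaf reaches the cutoff
        for i in rng.sample(candidates, min(batch, len(candidates))):
            labels[i] = oracle(pool[i])  # query the (expensive) label
            unlabeled.remove(i)
    return labels
```

In a MOF setting, `pool` would hold one descriptor vector per candidate structure and `oracle` would stand in for the expensive label computation (for example, a band gap or CO$_2$ adsorption calculation). The uniform sampling within selected leaves is a simplification chosen to keep the sketch short.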
