Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)

MatKG-2: Unveiling precise material science ontology through autonomous committees of LLMs

Vineeth Venugopal · Elsa Olivetti

Keywords: [ Natural Language Processing ] [ AI ] [ Materials Informatics ] [ Large language models ] [ materials science ]


Abstract:

This paper introduces MatKG-2, a materials science knowledge graph generated autonomously by a pipeline driven by large language models (LLMs). Building on MatKG, MatKG-2 uses a novel 'committee of large language models' approach to extract knowledge triples and classify them against an established ontology. Whereas the previous version relied on statistical co-occurrence, MatKG-2 provides more nuanced, ontology-based relationships. Because it uses open LLMs such as Llama 2 7B and BLOOM 1B/7B, the study supports reproducibility and broad community engagement. Using 4-bit and 8-bit quantized versions of these models for fine-tuning and inference also makes MatKG-2 more computationally tractable and therefore compatible with most commercially available GPUs. Our work highlights the potential of MatKG-2 to support materials science data infrastructure and to contribute to the semantic web.
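
The abstract does not say how the committee combines its members' outputs. As one possible reading, the Python sketch below treats each committee member as a function that proposes a relation label for an entity pair and keeps the triple only when a strict majority agrees. The majority-vote rule, the `committee_vote` function, the `min_agreement` threshold, the stub members, and the relation labels (`has_application`, `has_property`, `has_structure`) are all illustrative assumptions, not the authors' pipeline or the MatKG ontology.

```python
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple

# (head entity, relation label, tail entity)
Triple = Tuple[str, str, str]
# Hypothetical member interface: (head, tail) -> proposed relation label.
# In a real pipeline each member would wrap a call to a fine-tuned LLM.
Member = Callable[[str, str], str]


def committee_vote(
    head: str,
    tail: str,
    members: Sequence[Member],
    min_agreement: float = 0.5,
) -> Optional[Triple]:
    """Return a triple if a strict majority of members agree on the relation.

    Assumption: simple majority voting; the abstract does not specify
    the aggregation rule actually used.
    """
    if not members:
        raise ValueError("committee must have at least one member")
    votes = Counter(member(head, tail) for member in members)
    label, count = votes.most_common(1)[0]
    # Keep the triple only when agreement exceeds the threshold.
    if count / len(members) > min_agreement:
        return (head, label, tail)
    return None


# Stub members standing in for three different LLMs (hypothetical labels).
def member_a(head: str, tail: str) -> str:
    return "has_application"


def member_b(head: str, tail: str) -> str:
    return "has_application"


def member_c(head: str, tail: str) -> str:
    return "has_property"


def member_d(head: str, tail: str) -> str:
    return "has_structure"


accepted = committee_vote("TiO2", "photocatalysis", [member_a, member_b, member_c])
rejected = committee_vote("TiO2", "photocatalysis", [member_a, member_c, member_d])
```

Here `accepted` is a triple because two of three members agree, while `rejected` is `None` because no label wins a majority. Rejecting triples without agreement trades recall for precision, which is one plausible motivation for using a committee rather than a single model.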
