Skip to yearly menu bar Skip to main content

Workshop: MATH-AI: The 3rd Workshop on Mathematical Reasoning and AI

Towards Large Language Models as Copilots for Theorem Proving in Lean

Peiyang Song · Kaiyu Yang · Animashree Anandkumar

Keywords: [ automated reasoning ] [ theorem proving ] [ Lean ] [ proof assistants ] [ LLMs ]


Theorem proving is an important challenge for large language models (LLMs), as formal proofs can be checked rigorously by proof assistants such as Lean, leaving no room for hallucination. Existing LLM-based provers try to prove theorems in a fully autonomous mode without human intervention. In this mode, they struggle with novel and challenging theorems, for which human insights may be critical. In this paper, we explore LLMs as copilots that assist humans in proving theorems. We introduce Lean Copilot, a framework for running neural network inference in Lean. It enables programmers to build various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users. Using Lean Copilot, we build tools for suggesting proof steps and completing intermediate proof goals using LLMs. Experimental results demonstrate the effectiveness of our method in assisting humans compared to existing rule-based proof automation in Lean.

Chat is not available.