Skip to yearly menu bar Skip to main content


Poster

PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition

George Tsoukalas · Jasper Lee · John Jennings · Jimmy Xin · Michelle Ding · Michael Jennings · Amitayush Thakur · Swarat Chaudhuri

West Ballroom A-D #5102
[ ]
Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

We present PutnamBench, a new multilingual benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems. PutnamBench consists of 1337 hand-constructed formalizations of 514 theorems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America. All the theorems have formalizations in Lean 4 and Isabelle; a substantial subset also has Coq formalizations. Proving the theorems requires significant problem-solving ability and proficiency in a broad range of topics taught in undergraduate mathematics courses. We use PutnamBench to evaluate several established neural and symbolic theorem-provers. These approaches can only solve a handful of the PutnamBench problems, establishing the benchmark as a difficult open challenge for research on neural theorem-proving. PutnamBench is available at https://github.com/trishullab/PUTNAM.

Live content is unavailable. Log in and register to view live content