Timezone: »
Poster
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
Albert Q. Jiang · Wenda Li · Szymon Tworkowski · Konrad Czechowski · Tomasz Odrzygóźdź · Piotr Miłoś · Yuhuai Wu · Mateja Jamnik
In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. In Thor, a class of methods called hammers that leverage the power of automated theorem provers are used for premise selection, while all other tasks are designated to language models. Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own. Furthermore, with a significantly smaller computational budget, Thor can achieve a success rate on the MiniF2F dataset that is on par with the best existing methods. Thor can be instantiated for the majority of popular interactive theorem provers via a straightforward protocol we provide.
Author Information
Albert Q. Jiang (University of Cambridge)
Wenda Li (University of Cambridge)
Szymon Tworkowski (University of Warsaw)

Machine Learning Master’s student at the University of Warsaw, advised by Yuhuai Wu and Piotr Miłoś Interested in improving language models (efficient Transformers) and investigating their implications via topics like in-context learning or OOD generalization in reasoning tasks.
Konrad Czechowski (Uniwersytet Warszawski, ul. Krakowskie Przedmieście 26/28, 00-927 Warszawa, NIP 525-001-12-66)
Tomasz Odrzygóźdź (IDEAS NCBR Sp. z o.o. ul. Nowogrodzka 47A 00-695 Warszawa VAT Number: PL 7011017605)
Piotr Miłoś (Ideas NCBR, Polish Academy of Sciences)
Yuhuai Wu (Google)
Mateja Jamnik (University of Cambridge)
More from the Same Authors
-
2020 : Paper 44: CARLA Real Traffic Scenarios – novel training ground and benchmark for autonomous driving »
Błażej Osiński · Piotr Miłoś · Adam Jakubowski · Krzysztof Galias · Silviu Homoceanu -
2020 : Session A, Poster 7: Trust, But Verify: Model-Based Exploration In Sparse Reward Environments »
Konrad Czechowski -
2021 : Efficient Decompositional Rule Extraction for Deep Neural Networks »
Mateo Espinosa Zarlenga · Mateja Jamnik -
2021 : Interpretable Data Analysis for Bench-to-Bedside Research »
Zohreh Shams · Botty Dimanov · Nikola Simidjievski · Helena Andres-Terre · Paul Scherer · Urška Matjašec · Mateja Jamnik · Pietro Lió -
2021 : Off-Policy Correction For Multi-Agent Reinforcement Learning »
Michał Zawalski · Błażej Osiński · Henryk Michalewski · Piotr Miłoś -
2021 : Continuous Control With Ensemble Deep Deterministic Policy Gradients »
Piotr Januszewski · Mateusz Olko · Michał Królikowski · Jakub Swiatkowski · Marcin Andrychowicz · Łukasz Kuciński · Piotr Miłoś -
2021 : Can Network Flatness Explain the Training Speed-Generalisation Connection? »
Albert Q. Jiang · Clare Lyle · Lisa Schut · Yarin Gal -
2022 : Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs »
Albert Jiang · Sean Welleck · Jin Peng Zhou · Timothee Lacroix · Jiacheng Liu · Wenda Li · Mateja Jamnik · Guillaume Lample · Yuhuai Wu -
2022 : Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery »
Mateusz Olko · Michał Zając · Aleksandra Nowak · Nino Scherrer · Yashas Annadani · Stefan Bauer · Łukasz Kuciński · Piotr Miłoś -
2022 : Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search »
Michał Zawalski · Michał Tyrolski · Konrad Czechowski · Damian Stachura · Piotr Piękos · Tomasz Odrzygóźdź · Yuhuai Wu · Łukasz Kuciński · Piotr Miłoś -
2022 : The Surprising Effectiveness of Latent World Models for Continual Reinforcement Learning »
Samuel Kessler · Piotr Miłoś · Jack Parker-Holder · S Roberts -
2022 : Human Interventions in Concept Graph Networks »
Lucie Charlotte Magister · Pietro Barbiero · Dmitry Kazhdan · Federico Siciliano · Gabriele Ciravegna · Fabrizio Silvestri · Mateja Jamnik · Pietro Lió -
2022 Workshop: MATH-AI: Toward Human-Level Mathematical Reasoning »
Pan Lu · Swaroop Mishra · Sean Welleck · Yuhuai Wu · Hannaneh Hajishirzi · Percy Liang -
2022 Poster: Disentangling Transfer in Continual Reinforcement Learning »
Maciej Wolczyk · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś -
2022 Poster: Autoformalization with Large Language Models »
Yuhuai Wu · Albert Q. Jiang · Wenda Li · Markus Rabe · Charles Staats · Mateja Jamnik · Christian Szegedy -
2022 Poster: Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off »
Mateo Espinosa Zarlenga · Pietro Barbiero · Gabriele Ciravegna · Giuseppe Marra · Francesco Giannini · Michelangelo Diligenti · Zohreh Shams · Frederic Precioso · Stefano Melacci · Adrian Weller · Pietro Lió · Mateja Jamnik -
2022 Poster: Insights into Pre-training via Simpler Synthetic Tasks »
Yuhuai Wu · Felix Li · Percy Liang -
2022 Poster: STaR: Bootstrapping Reasoning With Reasoning »
Eric Zelikman · Yuhuai Wu · Jesse Mu · Noah Goodman -
2022 Poster: Exploring Length Generalization in Large Language Models »
Cem Anil · Yuhuai Wu · Anders Andreassen · Aitor Lewkowycz · Vedant Misra · Vinay Ramasesh · Ambrose Slone · Guy Gur-Ari · Ethan Dyer · Behnam Neyshabur -
2022 Poster: Solving Quantitative Reasoning Problems with Language Models »
Aitor Lewkowycz · Anders Andreassen · David Dohan · Ethan Dyer · Henryk Michalewski · Vinay Ramasesh · Ambrose Slone · Cem Anil · Imanol Schlag · Theo Gutman-Solo · Yuhuai Wu · Behnam Neyshabur · Guy Gur-Ari · Vedant Misra -
2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation »
Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse -
2022 Poster: Block-Recurrent Transformers »
DeLesley Hutchins · Imanol Schlag · Yuhuai Wu · Ethan Dyer · Behnam Neyshabur -
2021 : [S12] Efficient Decompositional Rule Extraction for Deep Neural Networks »
Mateo Espinosa Zarlenga · Mateja Jamnik -
2021 Poster: Subgoal Search For Complex Reasoning Tasks »
Konrad Czechowski · Tomasz Odrzygóźdź · Marek Zbysiński · Michał Zawalski · Krzysztof Olejnik · Yuhuai Wu · Łukasz Kuciński · Piotr Miłoś -
2021 Poster: Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication »
Łukasz Kuciński · Tomasz Korbak · Paweł Kołodziej · Piotr Miłoś -
2021 Poster: Continual World: A Robotic Benchmark For Continual Reinforcement Learning »
Maciej Wołczyk · Michał Zając · Razvan Pascanu · Łukasz Kuciński · Piotr Miłoś -
2020 : Poster Session A: 3:00 AM - 4:30 AM PST »
Taras Khakhulin · Ravichandra Addanki · Jinhwi Lee · Jungtaek Kim · Piotr Januszewski · Konrad Czechowski · Francesco Landolfi · Lovro Vrček · Oren Neumann · Claudius Gros · Betty Fabre · Lukas Faber · Lucas Anquetil · Alberto Franzin · Tommaso Bendinelli · Sergey Bartunov -
2019 : Coffee + Posters »
Changhao Chen · Nils Gählert · Edouard Leurent · Johannes Lehner · Apratim Bhattacharyya · Harkirat Singh Behl · Teck Yian Lim · Shiho Kim · Jelena Novosel · Błażej Osiński · Arindam Das · Ruobing Shen · Jeffrey Hawke · Joachim Sicking · Babak Shahian Jahromi · Theja Tulabandhula · Claudio Michaelis · Evgenia Rusak · WENHANG BAO · Hazem Rashed · JP Chen · Amin Ansari · Jaekwang Cha · Mohamed Zahran · Daniele Reda · Jinhyuk Kim · Kim Dohyun · Ho Suk · Junekyo Jhung · Alexander Kister · Matthias Fahrland · Adam Jakubowski · Piotr Miłoś · Jean Mercat · Bruno Arsenali · Silviu Homoceanu · Xiao-Yang Liu · Philip Torr · Ahmad El Sallab · Ibrahim Sobh · Anurag Arnab · Krzysztof Galias