煉金術：透過符號變異來增強定理證明能力

摘要

撰寫正式證明即使對經驗豐富的專家來說也是具有挑戰性的。最近神經定理證明（NTP）的進展顯示了加速這一過程的潛力。然而，網絡上可用的正式語料庫相對於一般文本來說有限，這為NTP帶來了重大的數據稀缺挑戰。為了解決這個問題，本研究提出了Alchemy，一個通用的數據合成框架，通過符號變異構建正式定理。具體來說，對於Mathlib中的每個候選定理，我們識別所有可調用的定理，可以對其進行重寫或應用。隨後，我們通過用等價形式或前提替換陳述中的相應術語來變異候選定理。因此，我們的方法將Mathlib中的定理數量增加了一個數量級，從110k增加到6M。此外，我們對這個擴充語料庫進行了持續的預訓練和監督微調以供大型語言模型使用。實驗結果顯示了我們方法的有效性，在Leandojo基準測試中實現了5%的絕對性能提升。此外，我們的合成數據在超出分佈的miniF2F基準測試中實現了2.5%的絕對性能增益。為了提供更多見解，我們對合成數據組成和訓練範式進行了全面分析，為發展強大的定理證明器提供了有價值的指導。

English

Formal proofs are challenging to write even for experienced experts. Recent progress in Neural Theorem Proving (NTP) shows promise in expediting this process. However, the formal corpora available on the Internet are limited compared to the general text, posing a significant data scarcity challenge for NTP. To address this issue, this work proposes Alchemy, a general framework for data synthesis that constructs formal theorems through symbolic mutation. Specifically, for each candidate theorem in Mathlib, we identify all invocable theorems that can be used to rewrite or apply to it. Subsequently, we mutate the candidate theorem by replacing the corresponding term in the statement with its equivalent form or antecedent. As a result, our method increases the number of theorems in Mathlib by an order of magnitude, from 110k to 6M. Furthermore, we perform continual pretraining and supervised finetuning on this augmented corpus for large language models. Experimental results demonstrate the effectiveness of our approach, achieving a 5% absolute performance improvement on Leandojo benchmark. Additionally, our synthetic data achieve a 2.5% absolute performance gain on the out-of-distribution miniF2F benchmark. To provide further insights, we conduct a comprehensive analysis of synthetic data composition and the training paradigm, offering valuable guidance for developing a strong theorem prover.

煉金術：透過符號變異來增強定理證明能力

Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation

摘要

Summary

Support

Support