Theorem proving is an important challenge for large language models (LLMs), as formal proofs can be checked rigorously by proof assistants such as Lean, leaving no room for hallucination. Existing LLM-based provers try to prove theorems in a fully autonomous mode without human intervention. In this mode, they struggle with novel and challenging theorems, for which human insights may be critical. In this paper, we explore LLMs as copilots that assist humans in proving theorems. We introduce Lean Copilot, a framework for running LLM inference in Lean. It enables programmers to build various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users. Using Lean Copilot, we build tools for suggesting proof steps (tactic suggestion), completing intermediate proof goals (proof search), and selecting relevant premises (premise selection) using LLMs. Users can use our pretrained models or bring their own, running them either locally (with or without GPUs) or in the cloud. Experimental results demonstrate the effectiveness of our method in assisting humans and automating the theorem proving process compared to existing rule-based proof automation in Lean. We open source all code under a permissive MIT license to facilitate further research.
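As a rough illustration of the tactic-suggestion workflow described above, the following Lean sketch invokes a suggestion tactic inside an ordinary proof. It assumes the LeanCopilot package has been added as a project dependency and that its model files are available; the tactic name `suggest_tactics` is taken from the project's public documentation and may differ between versions, and the example theorem is only illustrative.

```lean
import LeanCopilot

-- Assumption: LeanCopilot is a dependency of this project and its
-- pretrained model has been downloaded beforehand.
example (a b c : Nat) : a + b + c = a + c + b := by
  -- Asks the LLM for candidate next steps for the current goal.
  -- The user is expected to pick one and replace this line with it.
  suggest_tactics
```

The proof search and premise selection tools are invoked in the same place, as tactics inside a `by` block (documented as `search_proof` and `select_premises`), so the human stays in control of which suggestion ends up in the final proof.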
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables programmatic interaction with the proof environment. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection: a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): the first LLM-based prover that is augmented with retrieval for selecting premises from a vast math library. It is inexpensive and needs only one GPU week of training. Our retriever leverages LeanDojo's program analysis capability to identify accessible premises and hard negative examples, which makes retrieval much more effective. Furthermore, we construct a new benchmark consisting of 96,962 theorems and proofs extracted from Lean's math library. It features a challenging data split that requires the prover to generalize to theorems relying on novel premises that are never used in training. We use this benchmark for training and evaluation, and experimental results demonstrate the effectiveness of ReProver over non-retrieval baselines and GPT-4. We thus provide the first set of open-source LLM-based theorem provers without any proprietary datasets and release it under a permissive MIT license to facilitate further research.
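The idea of restricting retrieval to accessible premises can be sketched in a few lines of Python. This is a toy illustration, not the ReProver implementation: the bag-of-tokens `embed` function stands in for a learned encoder, and the premise names, statements, and the `accessible` set are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a learned encoder: bag-of-tokens counts."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve_premises(goal, premises, accessible, k=2):
    """Rank only the accessible premises by similarity to the goal."""
    query = embed(goal)
    candidates = [(name, stmt) for name, stmt in premises.items()
                  if name in accessible]
    ranked = sorted(candidates,
                    key=lambda p: cosine(query, embed(p[1])),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical premise library; "later_lemma" is defined after the theorem
# being proved, so it is not accessible and must never be retrieved.
premises = {
    "add_comm": "a + b = b + a",
    "mul_comm": "a * b = b * a",
    "add_assoc": "a + b + c = a + ( b + c )",
    "later_lemma": "a + b + c = a + c + b",
}
accessible = {"add_comm", "mul_comm", "add_assoc"}
top = retrieve_premises("a + b + c = a + c + b", premises, accessible, k=2)
print(top)
```

Filtering before ranking shrinks the candidate pool and removes premises that could never be used legally, which is one plausible reason accessibility information helps retrieval.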
The development of the industrial Internet of Things (IoT) calls for higher spectrum efficiency (SE). Faster than Nyquist (FTN) signaling and non-orthogonal multiple access (NOMA) are both promising paradigms for improving SE without requiring any extra spectrum resources. Combining FTN and NOMA is an interesting problem that has attracted attention recently. In NOMA, user pairing and power allocation are the key algorithms that determine system capacity. This paper first proposes a joint user pairing and power allocation algorithm for the FTN-based single-carrier (SC) NOMA system. Then, FTN-based multiple-input-multiple-output (MIMO) NOMA is studied and a dynamic user pairing and power allocation scheme is presented. In both scenarios, the objective is to maximize the available sum rate (ASR). Meanwhile, based on the fairness principle, each user's SE in the NOMA system is guaranteed to be no less than that in the orthogonal multiple access (OMA) system. Simulation results show the advantage of FTN-based NOMA with the proposed scheme in ASR and quality of service (QoS) performance. As far as we know, this paper presents the first solution to the problem of user pairing and power allocation in FTN-based NOMA, and it demonstrates the advantage of combining these two state-of-the-art technologies.
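To make the fairness principle concrete, the following worked derivation uses a simplified two-user downlink model with Nyquist signaling. This model is an assumption for illustration; the FTN-specific rate expressions are not given in the abstract. Here $\rho$ is the transmit SNR, $g_s > g_w$ are the channel gains of the strong and weak user, and $\alpha$ is the fraction of power given to the strong user, who applies successive interference cancellation.

```latex
\begin{align}
R_s^{\mathrm{NOMA}} &= \log_2\!\left(1 + \alpha \rho g_s\right), &
R_w^{\mathrm{NOMA}} &= \log_2\!\left(1 + \frac{(1-\alpha)\rho g_w}{\alpha \rho g_w + 1}\right), \\
R_k^{\mathrm{OMA}} &= \tfrac{1}{2}\log_2\!\left(1 + \rho g_k\right), & k &\in \{s, w\}.
\end{align}
% Imposing R_k^{NOMA} >= R_k^{OMA} for both users and solving for alpha:
\begin{equation}
\frac{1}{\sqrt{1 + \rho g_s} + 1} \;\le\; \alpha \;\le\; \frac{1}{\sqrt{1 + \rho g_w} + 1}.
\end{equation}
```

Because $g_s > g_w$, the lower bound is smaller than the upper bound, so a fair power split always exists in this simplified model. The sum rate increases with $\alpha$, so under these assumptions it is maximized at the upper bound.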
Faster than Nyquist (FTN) signaling and non-orthogonal multiple access (NOMA) are two promising paradigms for improving the spectrum efficiency (SE) of communication systems without requiring any extra spectrum resources. The combination of FTN signaling and NOMA is an interesting direction that has recently attracted the attention of researchers. In NOMA, user pairing and power allocation are the key algorithms that determine the system capacity. This paper proposes a joint optimal user pairing and power allocation algorithm for the FTN-based single-carrier (SC) NOMA system, considering user fairness. Then, the FTN-based multiple-input-multiple-output (MIMO) NOMA system is studied and a dynamic user pairing and power allocation scheme is presented. In both scenarios, the user fairness constraint guarantees that each user's SE in the NOMA system is no less than that in the orthogonal multiple access (OMA) system. Afterward, the performance of the proposed scheme in achievable sum rate (ASR) and quality of service (QoS) is derived and verified. Simulation results show that the proposed user pairing and power allocation achieve significantly higher ASRs and lower outage probabilities than both the conventional OMA system and the NOMA system with Nyquist-criterion transmission and random user pairing. As far as we know, this paper presents the first solution to the problem of user pairing and power allocation for FTN-based NOMA systems.
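A fairness-constrained power allocation of this kind can be sketched numerically for a single user pair. The sketch below is not the proposed algorithm: it assumes a simplified two-user downlink model with Nyquist signaling and an OMA baseline that gives each user half of the resource at full power. The function names and the parameter values are hypothetical.

```python
import math

def noma_rates(a_strong, rho, g_strong, g_weak):
    """Two-user downlink NOMA rates (bits/s/Hz), SIC at the strong user."""
    a_weak = 1.0 - a_strong
    r_strong = math.log2(1.0 + a_strong * rho * g_strong)
    r_weak = math.log2(
        1.0 + a_weak * rho * g_weak / (a_strong * rho * g_weak + 1.0))
    return r_strong, r_weak

def oma_rates(rho, g_strong, g_weak):
    """Assumed OMA baseline: half of the resource each, full power."""
    return (0.5 * math.log2(1.0 + rho * g_strong),
            0.5 * math.log2(1.0 + rho * g_weak))

def best_power_split(rho, g_strong, g_weak, steps=10000):
    """Grid search over the strong user's power fraction.

    Keeps only splits where both users do at least as well as in OMA,
    and returns (fraction, sum_rate) with the largest sum rate, or None.
    """
    oma_s, oma_w = oma_rates(rho, g_strong, g_weak)
    best = None
    for i in range(1, steps):
        a = i / steps
        r_s, r_w = noma_rates(a, rho, g_strong, g_weak)
        if r_s >= oma_s and r_w >= oma_w:
            if best is None or r_s + r_w > best[1]:
                best = (a, r_s + r_w)
    return best

# Hypothetical example: 20 dB transmit SNR, 10 dB gap between channel gains.
rho, g_strong, g_weak = 100.0, 1.0, 0.1
a_opt, sum_rate = best_power_split(rho, g_strong, g_weak)
print(a_opt, sum_rate, sum(oma_rates(rho, g_strong, g_weak)))
```

A pairing step could call `best_power_split` for each candidate pair and keep the pairing with the largest total sum rate. Exhaustive search over pairings grows quickly with the number of users, which is one reason dedicated pairing algorithms are of interest.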