Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kedar S. Namjoshi

Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework

Apr 28, 2025

Yuan Xia, Akanksha Atrey, Fadoua Khmaissia, Kedar S. Namjoshi

Abstract:This paper investigates the logical reasoning capabilities of large language models (LLMs). For a precisely defined yet tractable formulation, we choose the conceptually simple but technically complex task of constructing proofs in Boolean logic. A trained LLM receives as input a set of assumptions and a goal, and produces as output a proof that formally derives the goal from the assumptions. Incorrect proofs are caught by an automated proof checker. A critical obstacle for training is the scarcity of real-world proofs. We propose an efficient, randomized procedure for synthesizing valid proofs and introduce Template Transformation, a data augmentation technique that enhances the model's ability to handle complex logical expressions. The central evaluation question is whether an LLM has indeed learned to reason. We propose tests to measure the reasoning ability of a black-box LLM. By these measures, experiments demonstrate strong reasoning capabilities for assertions with short proofs, which decline with proof complexity. Notably, template transformation improves accuracy even for smaller models, suggesting its effectiveness across model scales.

Via

Access Paper or Ask Questions

The Resh Programming Language for Multirobot Orchestration

Mar 25, 2021

Martin Carroll, Kedar S. Namjoshi, Itai Segall

Figure 1 for The Resh Programming Language for Multirobot Orchestration

Figure 2 for The Resh Programming Language for Multirobot Orchestration

Figure 3 for The Resh Programming Language for Multirobot Orchestration

Abstract:This paper describes Resh, a new, statically typed, interpreted programming language and associated runtime for orchestrating multirobot systems. The main features of Resh are: (1) It offloads much of the tedious work of programming such systems away from the programmer and into the language runtime; (2) It is based on a small set of temporal and locational operators; and (3) It is not restricted to specific robot types or tasks. The Resh runtime consists of three engines that collaborate to run a Resh program using the available robots in their current environment. This paper describes both Resh and its runtime and gives examples of its use.

* Accepted for publication at ICRA'21

Via

Access Paper or Ask Questions