Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Emmanuel Anaya Gonzalez

Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective

Jun 06, 2025

Emmanuel Anaya Gonzalez, Sairam Vaidya, Kanghee Park, Ruyi Ji, Taylor Berg-Kirkpatrick, Loris D'Antoni

Abstract:Constrained decoding enables Language Models (LMs) to produce samples that provably satisfy hard constraints. However, existing constrained-decoding approaches often distort the underlying model distribution, a limitation that is especially problematic in applications like program fuzzing, where one wants to generate diverse and valid program inputs for testing purposes. We propose a new constrained sampling framework based on Markov Chain Monte Carlo (MCMC) that simultaneously satisfies three core desiderata: constraint satisfying (every sample satisfies the constraint), monotonically converging (the sampling process converges to the true conditional distribution), and efficient (high-quality samples emerge in few steps). Our method constructs a proposal distribution over valid outputs and applies a Metropolis-Hastings acceptance criterion based on the LM's likelihood, ensuring principled and efficient exploration of the constrained space. Empirically, our sampler outperforms existing methods on both synthetic benchmarks and real-world program fuzzing tasks.

Via

Access Paper or Ask Questions

Laurel: Generating Dafny Assertions Using Large Language Models

May 27, 2024

Eric Mugnier, Emmanuel Anaya Gonzalez, Ranjit Jhala, Nadia Polikarpova, Yuanyuan Zhou

Figure 1 for Laurel: Generating Dafny Assertions Using Large Language Models

Figure 2 for Laurel: Generating Dafny Assertions Using Large Language Models

Figure 3 for Laurel: Generating Dafny Assertions Using Large Language Models

Figure 4 for Laurel: Generating Dafny Assertions Using Large Language Models

Abstract:Dafny is a popular verification language, which automates proofs by outsourcing them to an SMT solver. This automation is not perfect, however, and the solver often requires guidance in the form of helper assertions creating a burden for the proof engineer. In this paper, we propose Laurel, a tool that uses large language models (LLMs) to automatically generate helper assertions for Dafny programs. To improve the success rate of LLMs in this task, we design two domain-specific prompting techniques. First, we help the LLM determine the location of the missing assertion by analyzing the verifier's error message and inserting an assertion placeholder at that location. Second, we provide the LLM with example assertions from the same codebase, which we select based on a new lemma similarity metric. We evaluate our techniques on a dataset of helper assertions we extracted from three real-world Dafny codebases. Our evaluation shows that Laurel is able to generate over 50% of the required helper assertions given only a few attempts, making LLMs a usable and affordable tool to further automate practical program verification.

* 10 pages, under review

Via

Access Paper or Ask Questions

HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis

May 24, 2024

Shraddha Barke, Emmanuel Anaya Gonzalez, Saketh Ram Kasibatla, Taylor Berg-Kirkpatrick, Nadia Polikarpova

Abstract:Many structured prediction and reasoning tasks can be framed as program synthesis problems, where the goal is to generate a program in a domain-specific language (DSL) that transforms input data into the desired output. Unfortunately, purely neural approaches, such as large language models (LLMs), often fail to produce fully correct programs in unfamiliar DSLs, while purely symbolic methods based on combinatorial search scale poorly to complex problems. Motivated by these limitations, we introduce a hybrid approach, where LLM completions for a given task are used to learn a task-specific, context-free surrogate model, which is then used to guide program synthesis. We evaluate this hybrid approach on three domains, and show that it outperforms both unguided search and direct sampling from LLMs, as well as existing program synthesizers.

Via

Access Paper or Ask Questions