Sean Welleck

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

Mar 19, 2026
Via arXiv

Argument Reconstruction as Supervision for Critical Thinking in LLMs

Mar 18, 2026
Via arXiv

Mind the Sim2Real Gap in User Simulation for Agentic Tasks

Mar 11, 2026
Via arXiv

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning

Feb 25, 2026
Via arXiv

Reasoning with Latent Tokens in Diffusion Language Models

Feb 03, 2026
Via arXiv

Propose, Solve, Verify: Self-Play Through Formal Verification

Dec 20, 2025
Via arXiv

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Aug 18, 2025
Via arXiv

Premise Selection for a Lean Hammer

Jun 09, 2025
Via arXiv

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

May 15, 2025
Via arXiv

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators

Mar 25, 2025
Via arXiv