Matthieu Zimmer

Risk-Controlled Lean-as-Judge for Natural-Language Mathematical Reasoning

May 27, 2026

The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling

May 04, 2026

The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $\lambda$-Calculus

Mar 20, 2026

Multi-Task GRPO: Reliable LLM Reasoning Across Tasks

Feb 05, 2026

Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening

Jan 29, 2026

Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning

Sep 11, 2025

Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving

Jul 03, 2025

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Feb 03, 2025

Mixture of Attentions For Speculative Decoding

Oct 04, 2024

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Jun 28, 2024