Maxime Heuillet

LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems

Nov 06, 2025
Via arXiv

Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

Aug 13, 2025
Via arXiv

Neural Active Learning Meets the Partial Monitoring Framework

May 14, 2024
Via arXiv

Randomized Confidence Bounds for Stochastic Partial Monitoring

Feb 07, 2024
Via arXiv