Maxime Heuillet

Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

Aug 13, 2025
Via arXiv

Neural Active Learning Meets the Partial Monitoring Framework

May 14, 2024
Via arXiv

Randomized Confidence Bounds for Stochastic Partial Monitoring

Feb 07, 2024
Via arXiv