Kaito Ariu

Policy Testing in Markov Decision Processes

May 21, 2025
Via arXiv

Evaluation of Best-of-N Sampling Strategies for Language Model Alignment

Feb 18, 2025
Via arXiv

Theoretical Guarantees for Minimum Bayes Risk Decoding

Feb 18, 2025
Via arXiv

The Power of Perturbation under Sampling in Solving Extensive-Form Games

Jan 28, 2025
Via arXiv

Last Iterate Convergence in Monotone Mean Field Games

Oct 07, 2024
Via arXiv

Matroid Semi-Bandits in Sublinear Time

May 28, 2024
Via arXiv

Filtered Direct Preference Optimization

Apr 23, 2024
Via arXiv

Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment

Apr 05, 2024
Via arXiv

Return-Aligned Decision Transformer

Feb 06, 2024
Via arXiv

Hyperparameter-Free Approach for Faster Minimum Bayes Risk Decoding

Jan 05, 2024
Via arXiv