
Tetsuro Morimura

Evaluation of Best-of-N Sampling Strategies for Language Model Alignment

Feb 18, 2025

Theoretical Guarantees for Minimum Bayes Risk Decoding

Feb 18, 2025

Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation

May 02, 2024

Filtered Direct Preference Optimization

Apr 23, 2024

Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment

Apr 05, 2024

On the True Distribution Approximation of Minimum Bayes-Risk Decoding

Mar 31, 2024

Return-Aligned Decision Transformer

Feb 06, 2024

Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding

Jan 10, 2024

Model-Based Minimum Bayes Risk Decoding

Nov 09, 2023

Policy Gradient with Kernel Quadrature

Oct 23, 2023