Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Apr 22, 2025
Figures 1-4 for Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model


View paper on arXiv
