Murtaza Nikzad

Forward versus Backward: Comparing Reasoning Objectives in Direct Preference Optimization

Jan 12, 2026
Via arXiv