Zhuodong Luo

DT2IT-MRM: Debiased Preference Construction and Iterative Training for Multimodal Reward Modeling

Apr 21, 2026
Via arXiv