Soowon Oh

PerMix-RLVR: Preserving Persona Expressivity under Verifiable-Reward Alignment

Apr 10, 2026
Via arXiv

mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT

Mar 25, 2026
Via arXiv

mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT

Mar 23, 2026
Via arXiv