Jieying Ye

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

Jan 31, 2026
Via arXiv