Picture for Shubhabrata Sengupta

Shubhabrata Sengupta

Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls

Add code
Oct 02, 2025
Viaarxiv icon