Youssef Emad

Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls

Oct 02, 2025
Via arXiv

NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

Jul 02, 2025
Via arXiv