Sida Zhao

Taming the Chaos: Coordinated Autoscaling for Heterogeneous and Disaggregated LLM Inference

Aug 27, 2025
Via arXiv

Understanding Stragglers in Large Model Training Using What-if Analysis

May 09, 2025
Via arXiv

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Feb 23, 2024
Via arXiv