Chunqiang Tang

Scaling Multi-Node Mixture-of-Experts Inference Using Expert Activation Patterns

Apr 25, 2026
Via arXiv

Training LLMs with Fault Tolerant HSDP on 100,000 GPUs

Jan 30, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv