Cesar A. Stuardo

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism

Apr 03, 2025
Via arXiv