Zhangyu Chen

Serving Large Language Models on Huawei CloudMatrix384

Jun 15, 2025
Via arXiv

Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation

Mar 26, 2025
Via arXiv