Fangzheng Miao

Multi-Scale Dequant: Eliminating Dequantization Bottleneck via Activation Decomposition for Efficient LLM Inference

May 13, 2026

Serving Large Language Models on Huawei CloudMatrix384

Jun 15, 2025