
Haoqi Yang

nncase: An End-to-End Compiler for Efficient LLM Deployment on Heterogeneous Storage Architectures

Dec 25, 2025

Faster MoE LLM Inference for Extremely Large Models

May 06, 2025