Madhu Kumar

How Far Can Disaggregation Go? A Design-Space Exploration of Attention-FFN Disaggregation for Efficient MoE LLM Serving

May 27, 2026
Via arXiv

Understanding and Optimizing Multi-Stage AI Inference Pipelines

Apr 16, 2025
Via arXiv

Demystifying Platform Requirements for Diverse LLM Inference Use Cases

Jun 03, 2024
Via arXiv