Nuwan Jayasena

The $qs$ Inequality: Quantifying the Double Penalty of Mixture-of-Experts at Inference

Mar 09, 2026

RAPID-Serve: Resource-efficient and Accelerated P/D Intra-GPU Disaggregation

Jan 16, 2026

T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives

Jan 30, 2024

Demystifying BERT: Implications for Accelerator Design

Apr 14, 2021