Zhengang Wang

Centrum: Model-based Database Auto-tuning with Minimal Distributional Assumptions

Oct 26, 2025
Via arXiv

Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling

Mar 07, 2025
Via arXiv