Jungwan Lee

FlashMoE: Reducing SSD I/O Bottlenecks via ML-Based Cache Replacement for Mixture-of-Experts Inference on Edge Devices

Jan 22, 2026
Via arXiv