Minchen Yu

UniScale: Adaptive Unified Inference Scaling via Online Joint Optimization of Model Routing and Test-Time Scaling

May 29, 2026
Via arXiv

SpecServe: Efficient and SLO-Aware Large Language Model Serving with Adaptive Speculative Decoding

Mar 07, 2025
Via arXiv