Yitao Hu

Mosaic: Unlocking Long-Context Inference for Diffusion LLMs via Global Memory Planning and Dynamic Peak Taming

Jan 10, 2026
Via arXiv

RAGPulse: An Open-Source RAG Workload Trace to Optimize RAG Serving Systems

Nov 17, 2025
Via arXiv

ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs

May 20, 2025
Via arXiv