Yanpeng Wang

Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference

Dec 18, 2025
Via arXiv

FEVO: Financial Knowledge Expansion and Reasoning Evolution for Large Language Models

Jul 09, 2025
Via arXiv

Astra: Efficient and Money-saving Automatic Parallel Strategies Search on Heterogeneous GPUs

Feb 19, 2025
Via arXiv