Hairui Zhao

KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving

May 13, 2026
Via arXiv

CCL-D: A High-Precision Diagnostic System for Slow and Hang Anomalies in Large-Scale Model Training

May 06, 2026
Via arXiv

TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training

Apr 27, 2026
Via arXiv

FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention

Apr 03, 2025
Via arXiv