Weile Luo

ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression

Mar 18, 2026
Via arXiv