SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving

Apr 21, 2026
