Purab Shingvi

GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models

Jan 08, 2026
Via arXiv