Picture for Jinda Jia

Jinda Jia

SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving

Add code
Apr 21, 2026
Viaarxiv icon

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

Add code
Oct 20, 2024
Figure 1 for SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Figure 2 for SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Figure 3 for SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Figure 4 for SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Viaarxiv icon