Matt Morse

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Apr 21, 2025
Via arXiv