Sookyung Choi

MoSKA: Mixture of Shared KV Attention for Efficient Long-Sequence LLM Inference

Nov 08, 2025
Via arXiv