BoYu Wang

RedKnot: Efficient Long-Context LLM Serving with Head-Aware KV Reuse and SegPagedAttention

Jun 04, 2026
Via arXiv