ZhaoKai Luo

RedKnot: Efficient Long-Context LLM Serving with Head-Aware KV Reuse and SegPagedAttention

Jun 04, 2026
Via arXiv