How Much Dense Attention is Necessary? Oracle-Guided Sparse Prefill for Full/GQA Layers in Hybrid Long-Context Models

Add code
Jun 05, 2026

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: