Qianchao Zhu

HeteroSpec: Leveraging Contextual Heterogeneity for Efficient Speculative Decoding

May 19, 2025
Via arXiv

SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention

Jun 28, 2024
Via arXiv

Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention

Jun 17, 2024
Via arXiv