Zhiyi Hong

Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

Apr 08, 2026
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Jan 24, 2026