Desheng Hui

SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning

Aug 08, 2025
Via arXiv