Token Reduction


PIO-FVLM: Rethinking Training-Free Visual Token Reduction for VLM Acceleration from an Inference-Objective Perspective

Feb 05, 2026
Via arXiv

FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion

Feb 05, 2026
Via arXiv

Laminating Representation Autoencoders for Efficient Diffusion

Feb 04, 2026
Via arXiv

Reg4Pru: Regularisation Through Random Token Routing for Token Pruning

Feb 03, 2026
Via arXiv

SpecMD: A Comprehensive Study On Speculative Expert Prefetching

Feb 03, 2026
Via arXiv

NeuroCanvas: VLLM-Powered Robust Seizure Detection by Reformulating Multichannel EEG as Image

Feb 04, 2026
Via arXiv

Swordsman: Entropy-Driven Adaptive Block Partition for Efficient Diffusion Language Models

Feb 04, 2026
Via arXiv

Token Pruning for In-Context Generation in Diffusion Transformers

Feb 02, 2026
Via arXiv

NEAT: Neuron-Based Early Exit for Large Reasoning Models

Feb 02, 2026
Via arXiv

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Feb 02, 2026
Via arXiv