Xiaoyu Shen

From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models

Mar 04, 2026
Via arXiv

Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models

Mar 03, 2026
Via arXiv

Beyond Global Similarity: Towards Fine-Grained, Multi-Condition Multimodal Retrieval

Mar 01, 2026
Via arXiv

What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models

Feb 28, 2026
Via arXiv

HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit

Feb 27, 2026
Via arXiv

UTPTrack: Towards Simple and Unified Token Pruning for Visual Tracking

Feb 27, 2026
Via arXiv

Rethinking the Role of LLMs in Time Series Forecasting

Feb 16, 2026
Via arXiv

On-Policy Supervised Fine-Tuning for Efficient Reasoning

Feb 13, 2026
Via arXiv

ViCA: Efficient Multimodal LLMs with Vision-Only Cross-Attention

Feb 07, 2026
Via arXiv

From LLMs to LRMs: Rethinking Pruning for Reasoning-Centric Models

Jan 26, 2026
Via arXiv