Shuhang Xun

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Mar 18, 2026
Via arXiv

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

May 04, 2025
Via arXiv