Xintong Zhu

Can Multimodal Large Language Models Truly Understand Small Objects?

Apr 24, 2026
Via arXiv