Shang-Jui Ray Kuo

Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

Mar 19, 2026
Via arXiv