Kaiwen Long

QMoP: Query Guided Mixture-of-Projector for Efficient Visual Token Compression

Mar 22, 2026
Via arXiv

Step-CoT: Stepwise Visual Chain-of-Thought for Medical Visual Question Answering

Mar 14, 2026
Via arXiv

ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion

Mar 04, 2026
Via arXiv

iGVLM: Dynamic Instruction-Guided Vision Encoding for Question-Aware Multimodal Understanding

Mar 03, 2026
Via arXiv

A Survey for Foundation Models in Autonomous Driving

Feb 02, 2024
Via arXiv