Keyi Kong

Addressing Overthinking in Large Vision-Language Models via Gated Perception-Reasoning Optimization

Jan 07, 2026
Via arXiv

SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models

Jun 15, 2025
Via arXiv

Music's Multimodal Complexity in AVQA: Why We Need More than General Multimodal LLMs

May 27, 2025
Via arXiv