Manar Ali

Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks?

Jun 13, 2025
Via arXiv