Shunqi Mao

LLM-Enhanced Rapid-Reflex Async-Reflect Embodied Agent for Real-Time Decision-Making in Dynamically Changing Environments

Jun 08, 2025
Via arXiv

Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding

Mar 13, 2025
Via arXiv

Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights

Jul 16, 2024
Via arXiv

Towards Generalisable Audio Representations for Audio-Visual Navigation

Jun 01, 2022
Via arXiv