Guojie Zhu

AVOC: Enhancing Hour-Level Audio-Video Understanding in Omni-Modal LLMs via Retrieval-Inspired Token Compression

Jun 23, 2026
Via arXiv