Yijing Chen

AVOC: Enhancing Hour-Level Audio-Video Understanding in Omni-Modal LLMs via Retrieval-Inspired Token Compression

Jun 23, 2026
Via arXiv

Unified Synthesis of Compositional Speech and Sound from Free-Form Text Prompts

May 27, 2026
Via arXiv

MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding

Feb 26, 2026
Via arXiv

ChronusOmni: Improving Time Awareness of Omni Large Language Models

Dec 10, 2025
Via arXiv

REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

Nov 17, 2025
Via arXiv