Zejun MA

video-SALMONN-R$^3$: Learning to ReWatch, ReAsk, and ReAnswer for Efficient Video Understanding

Jun 23, 2026
Via arXiv

video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Feb 17, 2025
Via arXiv