Weiheng Lu

SMART: Shot-Aware Multimodal Video Moment Retrieval with Audio-Enhanced MLLM

Nov 18, 2025
Via arXiv

DeepEyesV2: Toward Agentic Multimodal Model

Nov 10, 2025
Via arXiv

LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval

Nov 21, 2024
Via arXiv

A Survey on Benchmarks of Multimodal Large Language Models

Aug 16, 2024
Via arXiv