Weiheng Lu

ReDiPrune: Relevance-Diversity Pre-Projection Token Pruning for Efficient Multimodal LLMs

Mar 25, 2026
Via arXiv

SMART: Shot-Aware Multimodal Video Moment Retrieval with Audio-Enhanced MLLM

Nov 18, 2025
Via arXiv

DeepEyesV2: Toward Agentic Multimodal Model

Nov 10, 2025
Via arXiv

LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval

Nov 21, 2024
Via arXiv

A Survey on Benchmarks of Multimodal Large Language Models

Aug 16, 2024
Via arXiv