Keliang Li

LensWalk: Agentic Video Understanding by Planning How You See in Videos

Mar 25, 2026

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

May 19, 2025

HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding

Oct 09, 2024