Fanheng Kong

Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search

Jun 11, 2025, via arXiv

Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval

May 26, 2025, via arXiv

TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos

May 26, 2025, via arXiv

Is Mamba Effective for Time Series Forecasting?

Mar 17, 2024, via arXiv

StickerConv: Generating Multimodal Empathetic Responses from Scratch

Jan 20, 2024, via arXiv