Tingting Gao

ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations

May 29, 2025
Via arXiv

Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning

May 27, 2025
Via arXiv

GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art

May 16, 2025
Via arXiv

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

May 05, 2025
Via arXiv

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Apr 30, 2025
Via arXiv

VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform

Apr 21, 2025
Via arXiv

InstructEngine: Instruction-driven Text-to-Image Alignment

Apr 14, 2025
Via arXiv

Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models

Apr 09, 2025
Via arXiv

TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs

Mar 13, 2025
Via arXiv

Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding

Mar 12, 2025
Via arXiv