Zhendong Mao

DailyReport: An Open-ended Benchmark for Evaluating Search Agents on Daily Search Tasks

Jun 11, 2026
Via arXiv

Audio-Visual Exchange-Aware Token Pruning for Efficient Audio-Visual Captioning

Jun 09, 2026
Via arXiv

Towards Accurate Emotion-Attributed Video Captioning via Fine-grained Emotion-Cause Pair Extraction

Jun 07, 2026
Via arXiv

Asuka-Bench: Benchmarking Code Agents on Underspecified User Intent and Multi-Round Refinement

Jun 04, 2026
Via arXiv

Lance: Unified Multimodal Modeling by Multi-Task Synergy

May 20, 2026
Via arXiv

Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models

May 06, 2026
Via arXiv

Stream-T1: Test-Time Scaling for Streaming Video Generation

May 06, 2026
Via arXiv

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

May 05, 2026
Via arXiv

A Multi-Agent Framework with Structured Reasoning and Reflective Refinement for Multimodal Empathetic Response Generation

Apr 21, 2026
Via arXiv

CreatiParser: Generative Image Parsing of Raster Graphic Designs into Editable Layers

Apr 21, 2026
Via arXiv