Picture for Yue Ding

Yue Ding

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

Add code
May 25, 2026
Viaarxiv icon

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

Add code
May 21, 2026
Viaarxiv icon

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

Add code
May 18, 2026
Viaarxiv icon

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Add code
Apr 06, 2026
Viaarxiv icon

HACMatch Semi-Supervised Rotation Regression with Hardness-Aware Curriculum Pseudo Labeling

Add code
Mar 23, 2026
Viaarxiv icon

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Add code
Mar 19, 2026
Viaarxiv icon

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Add code
Feb 04, 2026
Viaarxiv icon

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Add code
Feb 03, 2026
Viaarxiv icon

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Add code
Feb 02, 2026
Viaarxiv icon

DiaDem: Advancing Dialogue Descriptions in Audiovisual Video Captioning for Multimodal Large Language Models

Add code
Jan 27, 2026
Viaarxiv icon