Bohan Zeng

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

May 25, 2026

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

May 21, 2026

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

May 18, 2026

VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction

May 14, 2026

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Apr 06, 2026

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Mar 27, 2026

Towards Next-Generation LLM Training: From the Data-Centric Perspective

Mar 16, 2026

AD-MIR: Bridging the Gap from Perception to Persuasion in Advertising Video Understanding via Structured Reasoning

Feb 07, 2026

M2A: Multimodal Memory Agent with Dual-Layer Hybrid Memory for Long-Term Personalized Interactions

Feb 07, 2026

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Feb 04, 2026