Yu Qiao

ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society

TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving

Apr 22, 2025
Via arXiv

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Apr 16, 2025
Via arXiv

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Apr 15, 2025
Via arXiv

VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning

Apr 10, 2025
Via arXiv

Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision

Apr 08, 2025
Via arXiv

ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting

Apr 02, 2025
Via arXiv

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Mar 27, 2025
Via arXiv

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Mar 27, 2025
Via arXiv

LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis

Mar 27, 2025
Via arXiv

AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset

Mar 25, 2025
Via arXiv