Xintao Wang

SemanticGen: Video Generation in Semantic Space

Dec 24, 2025
Via arXiv

Visual-Aware CoT: Achieving High-Fidelity Visual Consistency in Unified Models

Dec 22, 2025
Via arXiv

In-Context Audio Control of Video Diffusion Transformers

Dec 21, 2025
Via arXiv

Kling-Omni Technical Report

Dec 18, 2025
Via arXiv

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Dec 12, 2025
Via arXiv

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Nov 13, 2025
Via arXiv

Simulating the Visual World with Artificial Intelligence: A Roadmap

Nov 11, 2025
Via arXiv

RelightMaster: Precise Video Relighting with Multi-plane Light Images

Nov 09, 2025
Via arXiv

OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes

Oct 30, 2025
Via arXiv

VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning

Oct 29, 2025
Via arXiv