Xintao Wang

CineScene: Implicit 3D as Effective Scene Representation for Cinematic Video Generation

Feb 06, 2026

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Feb 02, 2026

HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing

Jan 29, 2026

HUMANLLM: Benchmarking and Reinforcing LLM Anthropomorphism via Human Cognitive Patterns

Jan 15, 2026

GARDO: Reinforcing Diffusion Models without Reward Hacking

Dec 30, 2025

SemanticGen: Video Generation in Semantic Space

Dec 24, 2025

Visual-Aware CoT: Achieving High-Fidelity Visual Consistency in Unified Models

Dec 22, 2025

In-Context Audio Control of Video Diffusion Transformers

Dec 21, 2025

Kling-Omni Technical Report

Dec 18, 2025

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Dec 12, 2025