Xin Tao

Geometry-Instructed Video Editing

Jun 23, 2026
Via arXiv

CineCap: Structured Reasoning with Spatio-Temporal Anchors for Cinematographic Video Captioning

Jun 23, 2026
Via arXiv

UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating

Jun 19, 2026
Via arXiv

Diffusing in the Right Space: A Systematic Study of Latent Diffusability

Jun 02, 2026
Via arXiv

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Jun 01, 2026
Via arXiv

SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation

May 18, 2026
Via arXiv

Amodal SAM: A Unified Amodal Segmentation Framework with Generalization

Apr 22, 2026
Via arXiv

Stable Velocity: A Variance Perspective on Flow Matching

Feb 05, 2026
Via arXiv

VMonarch: Efficient Video Diffusion Transformers with Structured Attention

Jan 29, 2026
Via arXiv

SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer

Jan 23, 2026
Via arXiv