Pengfei Wan

CineCap: Structured Reasoning with Spatio-Temporal Anchors for Cinematographic Video Captioning

Jun 23, 2026
Via arXiv

Geometry-Instructed Video Editing

Jun 23, 2026
Via arXiv

OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data

Jun 11, 2026
Via arXiv

ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation

Jun 10, 2026
Via arXiv

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

Jun 05, 2026
Via arXiv

Edit-R2: Context-Aware Reinforcement Learning for Multi-Turn Image Editing

Jun 04, 2026
Via arXiv

Diffusing in the Right Space: A Systematic Study of Latent Diffusability

Jun 02, 2026
Via arXiv

Geometry-Aware Implicit Memory for Video World Models

Jun 01, 2026
Via arXiv

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Jun 01, 2026
Via arXiv

SegTune: Structured and Fine-Grained Control for Song Generation

May 31, 2026
Via arXiv