Pengfei Wan

FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Jun 05, 2025
Via arXiv

UNIC: Unified In-Context Video Editing

Jun 04, 2025
Via arXiv

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

May 28, 2025
Via arXiv

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

May 27, 2025
Via arXiv

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

May 27, 2025
Via arXiv

Scaling Image and Video Generation via Test-Time Evolutionary Search

May 23, 2025
Via arXiv

Training-Free Efficient Video Generation via Dynamic Token Carving

May 22, 2025
Via arXiv

VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption

May 17, 2025
Via arXiv

Flow-GRPO: Training Flow Matching Models via Online RL

May 08, 2025
Via arXiv

A Survey of Interactive Generative Video

Apr 30, 2025
Via arXiv