Yabiao Wang

The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection

Dec 23, 2025

Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10×

Dec 15, 2025

RoleRMBench & RoleRM: Towards Reward Modeling for Profile-Based Role Play in Dialogue Systems

Dec 11, 2025

SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment

Aug 08, 2025

Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning

Jul 02, 2025

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

Jun 16, 2025

Swin DiT: Diffusion Transformer using Pseudo Shifted Windows

May 19, 2025

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

Mar 12, 2025

PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation

Mar 09, 2025

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Jan 01, 2025