Kaisiyuan Wang

MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model

Mar 16, 2026

DISPLAY: Directable Human-Object Interaction Video Generation via Sparse Motion Guidance and Multi-Task Auxiliary

Mar 10, 2026

iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer

Jun 15, 2025

AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers

Mar 25, 2025

Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers

Mar 13, 2025

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model

Oct 14, 2024

ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer

Aug 06, 2024

AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

Feb 25, 2024

ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces

Aug 17, 2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

May 09, 2023