Changqian Yu

Ingredients: Blending Custom Photos with Video Diffusion Transformers

Jan 03, 2025
Via arXiv

Video Diffusion Transformers are In-Context Learners

Dec 14, 2024
Via arXiv

MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis

Oct 28, 2024
Via arXiv

FLUX that Plays Music

Sep 01, 2024
Via arXiv

Scaling Diffusion Transformers to 16 Billion Parameters

Jul 16, 2024
Via arXiv

FutureNet-LOF: Joint Trajectory Prediction and Lane Occupancy Field Prediction with Future Context Encoding

Jun 20, 2024
Via arXiv

Dimba: Transformer-Mamba Diffusion Models

Jun 03, 2024
Via arXiv

Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models

Apr 06, 2024
Via arXiv

Scalable Diffusion Models with State Space Backbone

Feb 25, 2024
Via arXiv

SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation

Jan 15, 2024
Via arXiv