Ying Shan

VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models

Dec 27, 2024

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

Dec 24, 2024

Consistent Human Image and Video Generation with Spatially Conditioned Diffusion

Dec 19, 2024

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Dec 19, 2024

ColorFlow: Retrieval-Augmented Image Sequence Colorization

Dec 16, 2024

NeRF-Texture: Synthesizing Neural Radiance Field Textures

Dec 13, 2024

BrushEdit: All-In-One Image Inpainting and Editing

Dec 13, 2024

FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction

Dec 12, 2024

MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models

Dec 09, 2024

EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios

Dec 05, 2024