Picture for Jiangning Zhang

Jiangning Zhang

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Add code
Nov 26, 2024
Figure 1 for Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
Figure 2 for Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
Figure 3 for Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
Figure 4 for Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
Viaarxiv icon

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Add code
Nov 25, 2024
Figure 1 for Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Figure 2 for Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Figure 3 for Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Figure 4 for Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Viaarxiv icon

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Add code
Nov 24, 2024
Figure 1 for MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Figure 2 for MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Figure 3 for MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Figure 4 for MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Viaarxiv icon

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Add code
Nov 22, 2024
Figure 1 for FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
Figure 2 for FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
Figure 3 for FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
Figure 4 for FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
Viaarxiv icon

Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation

Add code
Nov 06, 2024
Figure 1 for Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Figure 2 for Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Figure 3 for Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Figure 4 for Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Viaarxiv icon

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Add code
Oct 21, 2024
Viaarxiv icon

OSV: One Step is Enough for High-Quality Image to Video Generation

Add code
Sep 17, 2024
Figure 1 for OSV: One Step is Enough for High-Quality Image to Video Generation
Figure 2 for OSV: One Step is Enough for High-Quality Image to Video Generation
Figure 3 for OSV: One Step is Enough for High-Quality Image to Video Generation
Figure 4 for OSV: One Step is Enough for High-Quality Image to Video Generation
Viaarxiv icon

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

Add code
Sep 10, 2024
Viaarxiv icon

Temporal and Interactive Modeling for Efficient Human-Human Motion Generation

Add code
Aug 30, 2024
Figure 1 for Temporal and Interactive Modeling for Efficient Human-Human Motion Generation
Figure 2 for Temporal and Interactive Modeling for Efficient Human-Human Motion Generation
Figure 3 for Temporal and Interactive Modeling for Efficient Human-Human Motion Generation
Figure 4 for Temporal and Interactive Modeling for Efficient Human-Human Motion Generation
Viaarxiv icon

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description

Add code
Aug 09, 2024
Viaarxiv icon