Yabiao Wang

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

Sep 10, 2024
Via arXiv

Temporal and Interactive Modeling for Efficient Human-Human Motion Generation

Aug 30, 2024
Via arXiv

DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Aug 24, 2024
Via arXiv

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description

Aug 09, 2024
Via arXiv

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Aug 06, 2024
Via arXiv

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

Jun 06, 2024
Via arXiv

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

May 28, 2024
Via arXiv

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

May 28, 2024
Via arXiv

Open-Vocabulary SAM3D: Understand Any 3D Scene

May 24, 2024
Via arXiv

PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

May 24, 2024
Via arXiv