Chengjie Wang

LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description

Aug 09, 2024
Via arXiv

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Aug 06, 2024
Via arXiv

Learning Multi-view Anomaly Detection

Jul 16, 2024
Via arXiv

PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

Jul 09, 2024
Via arXiv

Oracle Bone Inscriptions Multi-modal Dataset

Jul 04, 2024
Via arXiv

Enhancing Multi-Class Anomaly Detection via Diffusion Refinement with Dual Conditioning

Jul 02, 2024
Via arXiv

RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

Jun 26, 2024
Via arXiv

DF40: Toward Next-Generation Deepfake Detection

Jun 19, 2024
Via arXiv

AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

Jun 17, 2024
Via arXiv

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

Jun 06, 2024
Via arXiv