Yanwu Xu

South China University of Technology

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Mar 29, 2026
Via arXiv

Mobile-VTON: High-Fidelity On-Device Virtual Try-On

Mar 03, 2026
Via arXiv

FCMBench: A Comprehensive Financial Credit Multimodal Benchmark for Real-world Applications

Jan 06, 2026
Via arXiv

Rethinking Diffusion-Based Image Generators for Fundus Fluorescein Angiography Synthesis on Limited Data

Dec 17, 2024
Via arXiv

Precision-Enhanced Human-Object Contact Detection via Depth-Aware Perspective Interaction and Object Texture Restoration

Dec 13, 2024
Via arXiv

Semi-IIN: Semi-supervised Intra-inter modal Interaction Learning Network for Multimodal Sentiment Analysis

Dec 13, 2024
Via arXiv

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Dec 13, 2024
Via arXiv

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Dec 12, 2024
Via arXiv

TED-VITON: Transformer-Empowered Diffusion Models for Virtual Try-On

Nov 26, 2024
Via arXiv

Pre-trained Molecular Language Models with Random Functional Group Masking

Nov 03, 2024
Via arXiv