Yusheng Xie

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Jun 12, 2024

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

Apr 05, 2024

On the Scalability of Diffusion-based Text-to-Image Generation

Apr 03, 2024

MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Mar 05, 2024

Multiple-Question Multiple-Answer Text-VQA

Nov 15, 2023

SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation

Feb 07, 2023

AIM: Adapting Image Models for Efficient Video Action Recognition

Feb 06, 2023

Towards Differential Relational Privacy and its use in Question Answering

Mar 30, 2022

LaTr: Layout-Aware Transformer for Scene-Text VQA

Dec 24, 2021

TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation

Oct 29, 2021