Yapeng Tian

Semantic Grouping Network for Audio Source Separation

Jul 04, 2024
Via arXiv

AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation

Jun 11, 2024
Via arXiv

MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers

Jun 07, 2024
Via arXiv

Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation

May 24, 2024
Via arXiv

SignLLM: Sign Languages Production Large Language Models

May 17, 2024
Via arXiv

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Apr 02, 2024
Via arXiv

Robust Active Speaker Detection in Noisy Environments

Mar 30, 2024
Via arXiv

Text-to-Audio Generation Synchronized with Videos

Mar 08, 2024
Via arXiv

OSCaR: Object State Captioning and State Change Representation

Feb 28, 2024
Via arXiv

Efficiently Leveraging Linguistic Priors for Scene Text Spotting

Feb 27, 2024
Via arXiv