Zhongweiyang Xu

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking

Jan 25, 2026
Via arXiv

Contrastive Diffusion Guidance for Spatial Inverse Problems

Sep 30, 2025
Via arXiv

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior

May 17, 2025
Via arXiv

Unsupervised Blind Speech Separation with a Diffusion Prior

May 08, 2025
Via arXiv

Multi-Source Music Generation with Latent Diffusion

Sep 10, 2024
Via arXiv

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses

Aug 12, 2024
Via arXiv

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

Oct 02, 2023
Via arXiv

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

Sep 16, 2023
Via arXiv

SpatialCodec: Neural Spatial Speech Coding

Sep 14, 2023
Via arXiv

Learning to Separate Voices by Spatial Regions

Jul 15, 2022
Via arXiv