Chunlei Zhang

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

Oct 02, 2023
Via arXiv

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

Sep 16, 2023
Via arXiv

Make-A-Voice: Unified Voice Synthesis With Discrete Representation

May 30, 2023
Via arXiv

C3-DINO: Joint Contrastive and Non-contrastive Self-Supervised Learning for Speaker Verification

Aug 15, 2022
Via arXiv

UTTS: Unsupervised TTS with Conditional Disentangled Sequential Variational Auto-encoder

Jun 07, 2022
Via arXiv

LAE: Language-Aware Encoder for Monolingual and Multilingual ASR

Jun 05, 2022
Via arXiv

NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement

May 20, 2022
Via arXiv

Towards Improved Zero-shot Voice Conversion with Conditional DSVAE

May 11, 2022
Via arXiv

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Mar 31, 2022
Via arXiv

Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion

Mar 30, 2022
Via arXiv