Yannan Wang

Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization

Dec 07, 2023

The FlySpeech Audio-Visual Speaker Diarization System for MISP Challenge 2022

Jul 28, 2023

MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation

Jun 28, 2023

Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction

Jun 14, 2023

Inter-SubNet: Speech Enhancement with Subband Interaction

May 09, 2023

Distance-based Weight Transfer from Near-field to Far-field Speaker Verification

Mar 15, 2023

TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS Challenge

Mar 14, 2023

Speech Enhancement with Fullband-Subband Cross-Attention Network

Nov 10, 2022

Speech Enhancement with Intelligent Neural Homomorphic Synthesis

Oct 28, 2022

Local-global speaker representation for target speaker extraction

Oct 28, 2022