Yu Ting Yeung

ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis

Jun 13, 2024

Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis

Oct 24, 2023

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction

Apr 12, 2022

SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training

Jan 29, 2022

Reducing language context confusion for end-to-end code-switching automatic speech recognition

Jan 28, 2022

CCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation detection and diagnosis

Nov 16, 2021

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion

Jul 04, 2021

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

Jun 18, 2021

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization

Jun 18, 2021

The HUAWEI Speaker Diarisation System for the VoxCeleb Speaker Diarisation Challenge

Oct 23, 2020