Jaime Lorenzo-Trueba

Enhancing audio quality for expressive Neural Text-to-Speech

Aug 13, 2021

Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments

Jun 16, 2021

Weakly-supervised word-level pronunciation error detection in non-native English speech

Jun 07, 2021

Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

Apr 15, 2021

Mispronunciation Detection in Non-native (L2) English with Uncertainty Modeling

Feb 08, 2021

EmoCat: Language-agnostic Emotional Voice Conversion

Jan 14, 2021

Detection of Lexical Stress Errors in Non-native English with Data Augmentation and Attention

Dec 29, 2020

Voice Conversion for Whispered Speech Synthesis

Jan 17, 2020

Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection

Dec 02, 2019

Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech

Nov 28, 2019