Jaime Lorenzo-Trueba

Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations

Feb 05, 2024
Álvaro Martín-Cortinas, Daniel Sáez-Trigueros, Iván Vallés-Pérez, Biel Tura-Vecino, Piotr Biliński, Mateusz Lajszczak, Grzegorz Beringer, Roberto Barra-Chicote, Jaime Lorenzo-Trueba

Multilingual context-based pronunciation learning for Text-to-Speech

Jul 31, 2023
Giulia Comini, Manuel Sam Ribeiro, Fan Yang, Heereen Shim, Jaime Lorenzo-Trueba

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

Jul 31, 2023
Guangyan Zhang, Thomas Merritt, Manuel Sam Ribeiro, Biel Tura-Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba

Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings

Jul 31, 2023
Manuel Sam Ribeiro, Giulia Comini, Jaime Lorenzo-Trueba

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation

Jul 29, 2022
Giulia Comini, Goeric Huybrechts, Manuel Sam Ribeiro, Adam Gabrys, Jaime Lorenzo-Trueba

Computer-assisted Pronunciation Training -- Speech synthesis is almost all you need

Jul 02, 2022
Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Bozena Kostek

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

Feb 16, 2022
Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba

Cross-speaker style transfer for text-to-speech using data augmentation

Feb 10, 2022
Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba

Enhancing audio quality for expressive Neural Text-to-Speech

Aug 13, 2021
Abdelhamid Ezzerg, Adam Gabrys, Bartosz Putrycz, Daniel Korzekwa, Daniel Saez-Trigueros, David McHardy, Kamil Pokora, Jakub Lachowicz, Jaime Lorenzo-Trueba, Viacheslav Klimkov

Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments

Jun 16, 2021
Alejandro Mottini, Jaime Lorenzo-Trueba, Sri Vishnu Kumar Karlapati, Thomas Drugman
