Alexis Moinet

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Feb 15, 2024
Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

A Comparative Analysis of Pretrained Language Models for Text-to-Speech

Sep 04, 2023
Marcel Granero-Moya, Penny Karanasou, Sri Karlapati, Bastian Schnell, Nicole Peinelt, Alexis Moinet, Thomas Drugman

Controllable Emphasis with zero data for text-to-speech

Jul 13, 2023
Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova

eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer

Jun 20, 2023
Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

Jun 29, 2022
Peter Makarov, Ammar Abbas, Mateusz Łajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou

Expressive, Variable, and Controllable Duration Modelling in TTS

Jun 28, 2022
Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

Jun 27, 2022
Sri Karlapati, Penny Karanasou, Mateusz Lajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman

Distribution augmentation for low-resource expressive text-to-speech

Feb 19, 2022
Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova
