Arent van Korlaar

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Feb 15, 2024
Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

Controllable Emphasis with zero data for text-to-speech

Jul 13, 2023
Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

Jun 27, 2022
Sri Karlapati, Penny Karanasou, Mateusz Lajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman

Distribution augmentation for low-resource expressive text-to-speech

Feb 19, 2022
Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova
