Roberto Barra-Chicote

Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations

Feb 05, 2024
Via arXiv

Creating New Voices using Normalizing Flows

Dec 22, 2023
Via arXiv

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

Jul 31, 2023
Via arXiv

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces

Jul 23, 2023
Via arXiv

Remap, warp and attend: Non-parallel many-to-many accent conversion with Normalizing Flows

Nov 10, 2022
Via arXiv

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Nov 04, 2022
Via arXiv

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion

Jul 04, 2022
Via arXiv

Prosodic Alignment for off-screen automatic dubbing

Apr 06, 2022
Via arXiv

Text-free non-parallel many-to-many voice conversion using normalising flows

Mar 15, 2022
Via arXiv

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

Feb 16, 2022
Via arXiv