Joun Yeop Lee

High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model

Jun 25, 2024

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Jan 03, 2024

Efficient Parallel Audio Generation using Group Masked Language Modeling

Jan 02, 2024

Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis

Oct 05, 2023

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

Nov 30, 2022

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space

Nov 06, 2022

Into-TTS : Intonation Template based Prosody Control System

Apr 04, 2022

Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus

Mar 29, 2022

SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds

Jun 09, 2020