Sang-Hoon Lee

TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data

Jan 17, 2024
Seung-Bin Kim, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation

Jan 16, 2024
Hyung-Seok Oh, Sang-Hoon Lee, Deok-Hyun Cho, Seong-Whan Lee

Via arXiv

HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis

Nov 27, 2023
Sang-Hoon Lee, Ha-Yeong Choi, Seung-Bin Kim, Seong-Whan Lee

Via arXiv

Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation

Nov 08, 2023
Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv

DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training

Jul 31, 2023
Hyung-Seok Oh, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv

HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer

Jul 30, 2023
Sang-Hoon Lee, Ha-Yeong Choi, Hyung-Seok Oh, Seong-Whan Lee

Via arXiv

PauseSpeech: Natural Speech Synthesis via Pre-trained Language Model and Pause-based Prosody Modeling

Jun 13, 2023
Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv

HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models

Jun 12, 2023
Ji-Sang Hwang, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

May 25, 2023
Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

Via arXiv