Haohan Guo

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Feb 15, 2024
Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

Jan 08, 2024
Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng

QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning

Aug 31, 2023
Haohan Guo, Fenglong Xie, Jiawen Kang, Yujia Xiao, Xixin Wu, Helen Meng

Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations

Oct 27, 2022
Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng

A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS

Sep 22, 2022
Haohan Guo, Fenglong Xie, Frank K. Soong, Xixin Wu, Helen Meng

A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS

Mar 22, 2022
Haohan Guo, Hui Lu, Xixin Wu, Helen Meng

Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals

Jan 25, 2022
Haohan Guo, Zhiping Zhou, Fanbo Meng, Kai Liu

Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training

Dec 03, 2020
Haohan Guo, Heng Lu, Na Hu, Chunlei Zhang, Shan Yang, Lei Xie, Dan Su, Dong Yu
