Ryuichi Yamamoto

A Comparative Study of Voice Conversion Models with Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023

Oct 08, 2023
Ryuichi Yamamoto, Reo Yoneyama, Lester Phillip Violeta, Wen-Chin Huang, Tomoki Toda

Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders

Sep 18, 2023
Lester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Sep 15, 2023
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana

NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit

Oct 28, 2022
Ryuichi Yamamoto, Reo Yoneyama, Tomoki Toda

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Oct 28, 2022
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Oct 28, 2022
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana

Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs

Oct 28, 2022
Reo Yoneyama, Ryuichi Yamamoto, Kentaro Tachibana

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Jul 01, 2022
Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang

TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder

Jun 30, 2022
Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
