
Kentaro Tachibana


PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Sep 15, 2023
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana


ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings

May 23, 2023
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, Hiroshi Saruwatari


CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center

May 23, 2023
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari


Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Oct 28, 2022
Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana


Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Oct 28, 2022
Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana


Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs

Oct 28, 2022
Reo Yoneyama, Ryuichi Yamamoto, Kentaro Tachibana


Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History

Jun 16, 2022
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari


Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Apr 21, 2022
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana


DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

Mar 29, 2022
Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto


STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

Mar 28, 2022
Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari
