Takaaki Saeki

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Feb 29, 2024
Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov

SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics

Jan 30, 2024
Takaaki Saeki, Soumi Maiti, Shinnosuke Takamichi, Shinji Watanabe, Hiroshi Saruwatari

Diversity-based core-set selection for text-to-speech with linguistic and acoustic features

Sep 15, 2023
Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Feb 27, 2023
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining

Feb 05, 2023
Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari

SpeechLMScore: Evaluating speech generation using speech language model

Dec 08, 2022
Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection

Oct 26, 2022
Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari

Spontaneous speech synthesis with linguistic-speech consistency training using pseudo-filled pauses

Oct 18, 2022
Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari
