Alert button
Picture for Hiroshi Saruwatari

Hiroshi Saruwatari

Alert button

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Add code
Bookmark button
Alert button
Apr 06, 2024
Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie Liu, Jinyu Li, Sheng Zhao

Viaarxiv icon

Building speech corpus with diverse voice characteristics for its prompt-based representation

Add code
Bookmark button
Alert button
Mar 20, 2024
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari

Figure 1 for Building speech corpus with diverse voice characteristics for its prompt-based representation
Figure 2 for Building speech corpus with diverse voice characteristics for its prompt-based representation
Figure 3 for Building speech corpus with diverse voice characteristics for its prompt-based representation
Figure 4 for Building speech corpus with diverse voice characteristics for its prompt-based representation
Viaarxiv icon

Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation

Add code
Bookmark button
Alert button
Mar 19, 2024
Yuto Ishikawa, Kohei Konaka, Tomohiko Nakamura, Norihiro Takamune, Hiroshi Saruwatari

Figure 1 for Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation
Figure 2 for Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation
Figure 3 for Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation
Viaarxiv icon

SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics

Add code
Bookmark button
Alert button
Jan 30, 2024
Takaaki Saeki, Soumi Maiti, Shinnosuke Takamichi, Shinji Watanabe, Hiroshi Saruwatari

Viaarxiv icon

Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression

Add code
Bookmark button
Alert button
Jan 11, 2024
Yoshihide Tomita, Shoichi Koyama, Hiroshi Saruwatari

Viaarxiv icon

JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions

Add code
Bookmark button
Alert button
Oct 09, 2023
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari

Figure 1 for JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
Figure 2 for JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
Figure 3 for JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
Figure 4 for JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
Viaarxiv icon

Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control

Add code
Bookmark button
Alert button
Sep 24, 2023
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari

Figure 1 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 2 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 3 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Figure 4 for Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Viaarxiv icon

Do learned speech symbols follow Zipf's law?

Add code
Bookmark button
Alert button
Sep 18, 2023
Shinnosuke Takamichi, Hiroki Maeda, Joonyong Park, Daisuke Saito, Hiroshi Saruwatari

Figure 1 for Do learned speech symbols follow Zipf's law?
Figure 2 for Do learned speech symbols follow Zipf's law?
Figure 3 for Do learned speech symbols follow Zipf's law?
Figure 4 for Do learned speech symbols follow Zipf's law?
Viaarxiv icon

Diversity-based core-set selection for text-to-speech with linguistic and acoustic features

Add code
Bookmark button
Alert button
Sep 15, 2023
Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari

Figure 1 for Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Figure 2 for Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Figure 3 for Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Figure 4 for Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Viaarxiv icon

Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects

Add code
Bookmark button
Alert button
Sep 11, 2023
Shoichi Koyama, Masaki Nakada, Juliano G. C. Ribeiro, Hiroshi Saruwatari

Figure 1 for Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects
Figure 2 for Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects
Figure 3 for Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects
Figure 4 for Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects
Viaarxiv icon