Shinnosuke Takamichi

Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control

Sep 24, 2023
Via arXiv

Do learned speech symbols follow Zipf's law?

Sep 18, 2023
Via arXiv

Diversity-based core-set selection for text-to-speech with linguistic and acoustic features

Sep 15, 2023
Via arXiv

How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics

Jun 01, 2023
Via arXiv

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

May 26, 2023
Via arXiv

ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings

May 23, 2023
Via arXiv

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center

May 23, 2023
Via arXiv

JNV Corpus: A Corpus of Japanese Nonverbal Vocalizations with Diverse Phrases and Emotions

May 21, 2023
Via arXiv

Environmental sound conversion from vocal imitations and sound event labels

Apr 29, 2023
Via arXiv

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining

Feb 05, 2023
Via arXiv