Kenichi Fujita

Lightweight Zero-shot Text-to-Speech with Mixture of Adapters

Jul 01, 2024
Via arXiv

Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis

Feb 11, 2024
Via arXiv

Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters

Jan 10, 2024
Via arXiv

Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model

Apr 24, 2023
Via arXiv