Shinji Takaki

Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System

Nov 21, 2022
Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda

Neural Sequence-to-Sequence Speech Synthesis Using a Hidden Semi-Markov Model Based Structured Attention Mechanism

Aug 31, 2021
Yoshihiko Nankaku, Kenta Sumiya, Takenori Yoshimura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

Feb 15, 2021
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks

Oct 24, 2019
Kazuhiro Nakamura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Neural source-filter waveform models for statistical parametric speech synthesis

Apr 27, 2019
Xin Wang, Shinji Takaki, Junichi Yamagishi

Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform

Apr 07, 2019
Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi

Neural source-filter-based waveform model for statistical parametric speech synthesis

Oct 31, 2018
Xin Wang, Shinji Takaki, Junichi Yamagishi

STFT spectral loss for training a neural speech waveform model

Oct 30, 2018
Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

Oct 29, 2018
Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi

Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

Jul 31, 2018
Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu
