Shinji Takaki

Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System

Nov 21, 2022
Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda

Neural Sequence-to-Sequence Speech Synthesis Using a Hidden Semi-Markov Model Based Structured Attention Mechanism

Aug 31, 2021
Yoshihiko Nankaku, Kenta Sumiya, Takenori Yoshimura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

Feb 15, 2021
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks

Oct 24, 2019
Kazuhiro Nakamura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Neural source-filter waveform models for statistical parametric speech synthesis

Apr 27, 2019
Xin Wang, Shinji Takaki, Junichi Yamagishi

Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform

Apr 07, 2019
Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi

Neural source-filter-based waveform model for statistical parametric speech synthesis

Oct 31, 2018
Xin Wang, Shinji Takaki, Junichi Yamagishi

STFT spectral loss for training a neural speech waveform model

Oct 30, 2018
Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

Oct 29, 2018
Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi

Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

Jul 31, 2018
Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu