
Junichi Yamagishi

Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis

May 17, 2021

How do Voices from Past Speech Synthesis Challenges Compare Today?

May 13, 2021

Exploring Disentanglement with Multilingual and Monolingual VQ-VAE

May 04, 2021

Fashion-Guided Adversarial Attack on Person Segmentation

Apr 20, 2021

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

Apr 17, 2021

An Initial Investigation for Detecting Partially Spoofed Audio

Apr 06, 2021

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

Apr 04, 2021

ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech

Feb 11, 2021

Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis

Nov 10, 2020

Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm

Oct 21, 2020