Hiroshi Saruwatari

Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History

Jun 16, 2022

Region-to-region kernel interpolation of acoustic transfer function with directional weighting

May 05, 2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

Apr 22, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022

Apr 05, 2022

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

Mar 28, 2022

SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling

Mar 24, 2022

Personalized filled-pause generation with group-wise prediction models

Mar 18, 2022

Spatial active noise control based on individual kernel interpolation of primary and secondary sound fields

Feb 10, 2022

Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds

Feb 01, 2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis

Jan 26, 2022