Title: Direct Noisy Speech Modeling for Noisy-to-Noisy Voice Conversion

Nov 13, 2021

Chao Xie, Yi-Chiao Wu, Patrick Lumban Tobing, Wen-Chin Huang, Tomoki Toda

Abstract: Conventional voice conversion (VC) converts speaker information without altering the linguistic content. In some real-world scenarios, however, such as VC in movies/videos and VC in music, the voice is entangled with background sounds that are informative and need to be retained. We have developed a noisy-to-noisy (N2N) VC framework that converts the speaker's identity while preserving the background sounds. Although this framework, which consists of a denoising module and a VC module, handles the background sounds well, the VC module is sensitive to the distortion caused by the denoising module. To address this distortion issue, in this paper we propose an improved VC module that directly models the noisy speech waveform while controlling the background sounds. Experimental results demonstrate that the improved framework significantly outperforms the previous one and achieves an acceptable score in terms of naturalness, while reaching similarity performance comparable to the upper bound of our framework.
