Voice Conversion


Voice conversion is the process of transforming one speaker's voice so that it sounds like another speaker's, while preserving what was said.

Toward Metaphor-Fluid Conversation Design for Voice User Interfaces

Feb 17, 2025

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks

Feb 06, 2025

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech

Feb 13, 2025

Beyond the Monitor: Mixed Reality Visualization and AI for Enhanced Digital Pathology Workflow

May 05, 2025

SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors

Mar 20, 2025

Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement

Feb 11, 2025

GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations

Mar 26, 2025

Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference

Jan 23, 2025

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion

Dec 29, 2024

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Feb 05, 2025