Edresson Casanova

The Impact of Prosodic Segmentation on Speech Synthesis of Spontaneous Speech

Nov 06, 2025
Via arXiv

HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset

Jun 04, 2025
Via arXiv

Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model

May 21, 2025
Via arXiv

Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

Feb 07, 2025
Via arXiv

FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion

Jan 09, 2025
Via arXiv

Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference

Sep 18, 2024
Via arXiv

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Jun 07, 2024
Via arXiv

MLAAD: The Multi-Language Audio Anti-Spoofing Dataset

Jan 17, 2024
Via arXiv

CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages

Jun 16, 2023
Via arXiv

Evaluation of Speech Representations for MOS prediction

Jun 16, 2023
Via arXiv