Picture for Yabin Li

Yabin Li

MELA-TTS: Joint transformer-diffusion model with representation alignment for speech synthesis

Add code
Sep 18, 2025
Viaarxiv icon

FunAudio-ASR Technical Report

Add code
Sep 15, 2025
Viaarxiv icon

CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training

Add code
May 23, 2025
Viaarxiv icon

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Add code
Jan 10, 2025
Figure 1 for MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Figure 2 for MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Figure 3 for MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Figure 4 for MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Viaarxiv icon

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

Add code
May 18, 2023
Viaarxiv icon