Picture for Heng Lu

Heng Lu

CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

Add code
Jul 09, 2024
Viaarxiv icon

The Impact of Quantization and Pruning on Deep Reinforcement Learning Models

Add code
Jul 05, 2024
Viaarxiv icon

GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT

Add code
May 03, 2024
Figure 1 for GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT
Figure 2 for GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT
Figure 3 for GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT
Figure 4 for GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT
Viaarxiv icon

Vec-Tok Speech: speech vectorization and tokenization for neural speech generation

Add code
Oct 12, 2023
Viaarxiv icon

SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation

Add code
Oct 08, 2023
Viaarxiv icon

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

Add code
Sep 28, 2023
Figure 1 for PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Figure 2 for PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Figure 3 for PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Figure 4 for PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Viaarxiv icon

PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts

Add code
Sep 17, 2023
Figure 1 for PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Figure 2 for PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Figure 3 for PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Figure 4 for PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Viaarxiv icon

DiaCorrect: Error Correction Back-end For Speaker Diarization

Add code
Sep 15, 2023
Figure 1 for DiaCorrect: Error Correction Back-end For Speaker Diarization
Figure 2 for DiaCorrect: Error Correction Back-end For Speaker Diarization
Figure 3 for DiaCorrect: Error Correction Back-end For Speaker Diarization
Figure 4 for DiaCorrect: Error Correction Back-end For Speaker Diarization
Viaarxiv icon

METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer

Add code
Jul 29, 2023
Figure 1 for METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer
Figure 2 for METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer
Figure 3 for METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer
Figure 4 for METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer
Viaarxiv icon

GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition

Add code
Jun 16, 2023
Figure 1 for GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition
Figure 2 for GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition
Figure 3 for GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition
Viaarxiv icon