Heng Lu

GMP-ATL: Gender-augmented Multi-scale Pseudo-label Enhanced Adaptive Transfer Learning for Speech Emotion Recognition via HuBERT

May 03, 2024
Via arXiv

Vec-Tok Speech: speech vectorization and tokenization for neural speech generation

Oct 12, 2023
Via arXiv

SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation

Oct 08, 2023
Via arXiv

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

Sep 28, 2023
Via arXiv

PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts

Sep 17, 2023
Via arXiv

DiaCorrect: Error Correction Back-end For Speaker Diarization

Sep 15, 2023
Via arXiv

METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer

Jul 29, 2023
Via arXiv

GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition

Jun 16, 2023
Via arXiv

HYBRIDFORMER: improving SqueezeFormer with hybrid attention and NSR mechanism

Mar 15, 2023
Via arXiv

LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition

Dec 05, 2022
Via arXiv