Zhengyang Chen

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion

Sep 10, 2024
Via arXiv

Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching

Sep 07, 2024
Via arXiv

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Jul 21, 2024
Via arXiv

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems

Jun 13, 2024
Via arXiv

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024
Via arXiv

Prompt-driven Target Speech Diarization

Oct 23, 2023
Via arXiv

Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

Sep 27, 2023
Via arXiv

Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer

Sep 13, 2023
Via arXiv

Exploring Binary Classification Loss For Speaker Verification

Jul 17, 2023
Via arXiv

Wespeaker baselines for VoxSRC2023

Jun 28, 2023
Via arXiv