Yanmin Qian

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Sep 24, 2024

Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models

Sep 11, 2024

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion

Sep 10, 2024

Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching

Sep 07, 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Jul 21, 2024

Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement

Jun 19, 2024

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Jun 17, 2024

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems

Jun 13, 2024

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024

Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization

Jun 08, 2024