Yanmin Qian

Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement

Jun 19, 2024
Via arXiv

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Jun 17, 2024
Via arXiv

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems

Jun 13, 2024
Via arXiv

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024
Via arXiv

Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization

Jun 08, 2024
Via arXiv

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024
Via arXiv

Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement

Jun 06, 2024
Via arXiv

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

May 28, 2024
Via arXiv

CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs

May 27, 2024
Via arXiv

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

Apr 29, 2024
Via arXiv