Yanmin Qian

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024

Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement

Jun 06, 2024

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

May 28, 2024

CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs

May 27, 2024

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

Apr 29, 2024

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

Apr 10, 2024

Advanced Long-Content Speech Recognition With Factorized Neural Transducer

Mar 20, 2024

Improving Design of Input Condition Invariant Speech Enhancement

Jan 25, 2024

FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition

Nov 29, 2023

Prompt-driven Target Speech Diarization

Oct 23, 2023