Yanmin Qian

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

Apr 10, 2024
Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng

Advanced Long-Content Speech Recognition With Factorized Neural Transducer

Mar 20, 2024
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

Improving Design of Input Condition Invariant Speech Enhancement

Jan 25, 2024
Wangyou Zhang, Jee-weon Jung, Shinji Watanabe, Yanmin Qian

FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition

Nov 29, 2023
Dongning Yang, Wei Wang, Yanmin Qian

Prompt-driven Target Speech Diarization

Oct 23, 2023
Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models

Oct 14, 2023
Hang Shao, Bei Liu, Yanmin Qian

Toward Universal Speech Enhancement for Diverse Input Conditions

Sep 29, 2023
Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian

Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

Sep 27, 2023
Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li

Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction

Sep 25, 2023
Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng

The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

Sep 24, 2023
Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu
