Yanmin Qian

Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor

May 18, 2023
Zhengyang Chen, Bing Han, Shuai Wang, Yanmin Qian

Via arXiv

Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification

Apr 12, 2023
Bing Han, Zhengyang Chen, Yanmin Qian

Via arXiv

Code-Switching Text Generation and Injection in Mandarin-English ASR

Mar 20, 2023
Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng

Via arXiv

Target Sound Extraction with Variable Cross-modality Clues

Mar 15, 2023
Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng

Via arXiv

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

Nov 17, 2022
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

Via arXiv

Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022

Nov 02, 2022
Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

Via arXiv

Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit

Nov 01, 2022
Hongji Wang, Chengdong Liang, Shuai Wang, Zhengyang Chen, Binbin Zhang, Xu Xiang, Yanlei Deng, Yanmin Qian

Via arXiv

A comprehensive study on self-supervised distillation for speaker representation learning

Oct 28, 2022
Zhengyang Chen, Yao Qian, Bing Han, Yanmin Qian, Michael Zeng

Via arXiv

SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022

Sep 20, 2022
Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian

Via arXiv

The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines

Aug 17, 2022
Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan

Via arXiv