"speech": models, code, and papers

INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

Apr 02, 2021
Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, Jiangyu Han, Zhongjie Jiang, Lei Xie, Yannan Wang, Shinji Watanabe, Zheng-Hua Tan, Hui Bu, Tao Yu, Shidong Shang

Data Augmentation based Consistency Contrastive Pre-training for Automatic Speech Recognition

Dec 23, 2021
Changfeng Gao, Gaofeng Cheng, Yifan Guo, Qingwei Zhao, Pengyuan Zhang

Distilling the Knowledge of BERT for CTC-based ASR

Sep 05, 2022
Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

The Analysis of Synonymy and Antonymy in Discourse Relations: An interpretable Modeling Approach

Aug 09, 2022
A. Reig-Alamillo, D. Torres-Moreno, E. Morales-González, M. Toledo-Acosta, A. Taroni, J. Hermosillo-Valadez

Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody

May 10, 2022
Pol van Rijn, Harin Lee, Nori Jacoby

Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI

Dec 30, 2021
Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou

Direct Noisy Speech Modeling for Noisy-to-Noisy Voice Conversion

Nov 13, 2021
Chao Xie, Yi-Chiao Wu, Patrick Lumban Tobing, Wen-Chin Huang, Tomoki Toda

MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

Feb 25, 2021
Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward

Oct 02, 2022
Awais Khan, Khalid Mahmood Malik, James Ryan, Mikul Saravanan

Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

Jun 27, 2022
Eklavya Sarkar, RaviShankar Prasad, Mathew Magimai.-Doss
