"speech recognition": models, code, and papers

Semi-Supervised Speech Recognition via Local Prior Matching

Feb 24, 2020
Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

Cascade RNN-Transducer: Syllable Based Streaming On-device Mandarin Speech Recognition with a Syllable-to-Character Converter

Nov 17, 2020
Xiong Wang, Zhuoyuan Yao, Xian Shi, Lei Xie

Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition

Mar 31, 2021
Sehoon Kim, Amir Gholami, Zhewei Yao, Anirudda Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer

Privacy-preserving Speech Emotion Recognition through Semi-Supervised Federated Learning

Feb 05, 2022
Vasileios Tsouvalas, Tanir Ozcelebi, Nirvana Meratnia

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

Mar 06, 2023
Jaesung Huh, Andrew Brown, Jee-weon Jung, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, Andrew Zisserman

SAR-Net: A End-to-End Deep Speech Accent Recognition Network

Nov 25, 2020
Wei Wang, Chao Zhang, Xiaopei Wu

Weak-Attention Suppression For Transformer Based Speech Recognition

May 18, 2020
Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer

Generalizing in the Real World with Representation Learning

Oct 18, 2022
Tegan Maharaj

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

Jul 16, 2022
Viet Anh Trinh, Pegah Ghahremani, Brian King, Jasha Droppo, Andreas Stolcke, Roland Maas

Assessing ASR Model Quality on Disordered Speech using BERTScore

Sep 21, 2022
Jimmy Tobin, Qisheng Li, Subhashini Venugopalan, Katie Seaver, Richard Cave, Katrin Tomanek
