"speech recognition": models, code, and papers

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

Apr 06, 2021
Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

(4 figures)

CarneliNet: Neural Mixture Model for Automatic Speech Recognition

Jul 22, 2021
Aleksei Kalinov, Somshubra Majumdar, Jagadeesh Balam, Boris Ginsburg

(4 figures)

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition

Apr 05, 2022
Ye-Qian Du, Jie Zhang, Qiu-Shi Zhu, Li-Rong Dai, Ming-Hui Wu, Xin Fang, Zhou-Wang Yang

(4 figures)

Physics-inspired Neuroacoustic Computing Based on Tunable Nonlinear Multiple-scattering

Apr 17, 2023
Ali Momeni, Xinxin Guo, Herve Lissek, Romain Fleury

(4 figures)

A Comparison of Temporal Encoders for Neuromorphic Keyword Spotting with Few Neurons

Jan 24, 2023
Mattias Nilsson, Ton Juny Pina, Lyes Khacef, Foteini Liwicki, Elisabetta Chicca, Fredrik Sandin

(4 figures)

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Mar 02, 2022
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil

(4 figures)

Automatic Speech Recognition Datasets in Cantonese Language: A Survey and a New Dataset

Jan 07, 2022
Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

(4 figures)

Amortized Neural Networks for Low-Latency Speech Recognition

Aug 03, 2021
Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

(3 figures)

Semi-supervised transfer learning for language expansion of end-to-end speech recognition models to low-resource languages

Nov 19, 2021
Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim

(4 figures)

Applying wav2vec2.0 to Speech Recognition in various low-resource languages

Dec 22, 2020
Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

(4 figures)