"speech recognition": models, code, and papers

Amortized Neural Networks for Low-Latency Speech Recognition

Aug 03, 2021
Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition

Sep 14, 2021
Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi

Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition

Mar 31, 2022
Anirudh Gupta, Rishabh Gaur, Ankur Dhuriya, Harveen Singh Chadha, Neeraj Chhimwal, Priyanshi Shah, Vivek Raghavan

Applying wav2vec2.0 to Speech Recognition in various low-resource languages

Dec 22, 2020
Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

KoSpeech: Open-Source Toolkit for End-to-End Korean Speech Recognition

Sep 07, 2020
Soohwan Kim, Seyoung Bae, Cheolhwang Won

Improved disentangled speech representations using contrastive learning in factorized hierarchical variational autoencoder

Nov 15, 2022
Yuying Xie, Thomas Arildsen, Zheng-Hua Tan

Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study

Mar 31, 2022
Keyu An, Zhijian Ou

Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models

Dec 20, 2022
Changli Tang, Yujin Wang, Xie Chen, Wei-Qiang Zhang

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

Oct 11, 2021
Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, Shinji Watanabe

Advancing Speech Recognition With No Speech Or With Noisy Speech

Jun 17, 2019
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik
