"speech recognition": models, code, and papers

Lookup-Table Recurrent Language Models for Long Tail Speech Recognition

Apr 09, 2021
W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman


Perceptive, non-linear Speech Processing and Spiking Neural Networks

Mar 31, 2022
Jean Rouat, Ramin Pichevar, Stéphane Loiselle


Jira: a Kurdish Speech Recognition System Designing and Building Speech Corpus and Pronunciation Lexicon

Feb 15, 2021
Hadi Veisi, Hawre Hosseini, Mohammad Mohammadamini, Wirya Fathy, Aso Mahmudi


Transfer Learning from Audio-Visual Grounding to Speech Recognition

Jul 09, 2019
Wei-Ning Hsu, David Harwath, James Glass


Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

Nov 06, 2022
Jiatong Shi, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, Hung-yi Lee


Efficient Use of Large Pre-Trained Models for Low Resource ASR

Oct 26, 2022
Peter Vieting, Christoph Lüscher, Julian Dierkes, Ralf Schlüter, Hermann Ney


Speech Aware Dialog System Technology Challenge (DSTC11)

Dec 16, 2022
Hagen Soltau, Izhak Shafran, Mingqiu Wang, Abhinav Rastogi, Jeffrey Zhao, Ye Jia, Wei Han, Yuan Cao, Aramys Miranda


LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural Networks

Jan 12, 2023
Nelly Elsayed, Zag ElSayed, Anthony S. Maida


Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

Jun 13, 2019
Guan-Lin Chao, William Chan, Ian Lane


End-to-End Label Uncertainty Modeling in Speech Emotion Recognition using Bayesian Neural Networks and Label Distribution Learning

Sep 30, 2022
Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann
