"speech recognition": models, code, and papers

Simple and Effective Zero-shot Cross-lingual Phoneme Recognition

Sep 23, 2021
Qiantong Xu, Alexei Baevski, Michael Auli

Domain Prompts: Towards memory and compute efficient domain adaptation of ASR systems

Dec 16, 2021
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Predicting Affective Vocal Bursts with Finetuned wav2vec 2.0

Sep 27, 2022
Bagus Tris Atmaja, Akira Sasou

Multilingual training set selection for ASR in under-resourced Malian languages

Aug 13, 2021
Ewald van der Westhuizen, Trideba Padhi, Thomas Niesler

Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

Jan 25, 2022
Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, Shinji Watanabe

A Convolutional Neural Network Based Approach to Recognize Bangla Spoken Digits from Speech Signal

Nov 12, 2021
Ovishake Sen, Al-Mahmud, Pias Roy

End-to-End Speaker-Attributed ASR with Transformer

Apr 05, 2021
Naoyuki Kanda, Guoli Ye, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Speech Recognition with Deep Recurrent Neural Networks

Mar 22, 2013
Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton

Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition

Aug 15, 2020
Shamane Siriwardhana, Andrew Reis, Rivindu Weerasekera, Suranga Nanayakkara

PickNet: Real-Time Channel Selection for Ad Hoc Microphone Arrays

Jan 24, 2022
Takuya Yoshioka, Xiaofei Wang, Dongmei Wang
