"speech recognition": models, code, and papers

Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping

Jun 19, 2022
Jenthe Thienpondt, Kris Demuynck

Advances and Challenges in Deep Lip Reading

Oct 15, 2021
Marzieh Oghbaie, Arian Sabaghi, Kooshan Hashemifard, Mohammad Akbari

Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

Oct 31, 2016
Hagen Soltau, Hank Liao, Hasim Sak

Bimodal Speech Emotion Recognition Using Pre-Trained Language Models

Nov 29, 2019
Verena Heusser, Niklas Freymuth, Stefan Constantin, Alex Waibel

Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition

May 19, 2020
George Sterpu, Christian Saam, Naomi Harte

AutoDiCE: Fully Automated Distributed CNN Inference at the Edge

Jul 20, 2022
Xiaotian Guo, Andy D. Pimentel, Todor Stefanov

Unsupervised Cross-Lingual Speech Emotion Recognition Using Pseudo Multilabel

Aug 19, 2021
Jin Li, Nan Yan, Lan Wang

Decoupled Federated Learning for ASR with Non-IID Data

Jun 18, 2022
Han Zhu, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

Deep Neural Mel-Subband Beamformer for In-car Speech Separation

Nov 22, 2022
Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu

Clinical Dialogue Transcription Error Correction using Seq2Seq Models

May 26, 2022
Gayani Nanayakkara, Nirmalie Wiratunga, David Corsar, Kyle Martin, Anjana Wijekoon
