"speech recognition": models, code, and papers

PM-MMUT: Boosted Phone-mask Data Augmentation using Multi-modeling Unit Training for Robust Uyghur E2E Speech Recognition

Dec 13, 2021
Guodong Ma, Pengfei Hu, Nurmemet Yolwas, Shen Huang, Hao Huang

Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition

Oct 24, 2019
Thejan Rajapakshe, Rajib Rana, Siddique Latif, Sara Khalifa, Björn W. Schuller

G-Augment: Searching For The Meta-Structure Of Data Augmentation Policies For ASR

Oct 19, 2022
Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

Transformer-based language modeling and decoding for conversational speech recognition

Jan 04, 2020
Kareem Nassar

MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement

Sep 15, 2022
Jianrong Wang, Xiaomin Li, Xuewei Li, Mei Yu, Qiang Fang, Li Liu

Efficient spike encoding algorithms for neuromorphic speech recognition

Jul 14, 2022
Sidi Yaya Arnaud Yarga, Jean Rouat, Sean U. N. Wood

Foundation Transformers

Oct 12, 2022
Hongyu Wang, Shuming Ma, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei

Multitask Learning from Augmented Auxiliary Data for Improving Speech Emotion Recognition

Jul 12, 2022
Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn W. Schuller

Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition

Jun 14, 2021
Andrew Slottje, Shannon Wotherspoon, William Hartmann, Matthew Snover, Owen Kimball

Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition

Oct 30, 2020
Wei Zhou, Simon Berger, Ralf Schlüter, Hermann Ney
