"speech recognition": models, code, and papers

Effect of different splitting criteria on the performance of speech emotion recognition

Oct 26, 2022
Bagus Tris Atmaja, Akira Sasou

Via arXiv

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

Feb 22, 2022
Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang

Via arXiv

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

Nov 04, 2022
Xin Zhang, Iván Vallés-Pérez, Andreas Stolcke, Chengzhu Yu, Jasha Droppo, Olabanji Shonibare, Roberto Barra-Chicote, Venkatesh Ravichandran

Via arXiv

Self-Supervised Learning for speech recognition with Intermediate layer supervision

Dec 16, 2021
Chengyi Wang, Yu Wu, Sanyuan Chen, Shujie Liu, Jinyu Li, Yao Qian, Zhenglu Yang

Via arXiv

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

Jan 24, 2022
Xurong Xie, Rukiye Ruzi, Xunying Liu, Lan Wang

Via arXiv

Efficient conformer-based speech recognition with linear attention

Apr 14, 2021
Shengqiang Li, Menglong Xu, Xiao-Lei Zhang

Via arXiv

Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence

Apr 18, 2023
Yicheng Hsu, Mingsian R. Bai

Via arXiv

The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods

Feb 20, 2021
Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie

Via arXiv

Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition

Jan 27, 2022
Mohammad Soleymanpour, Michael T. Johnson, Rahim Soleymanpour, Jeffrey Berry

Via arXiv

Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition

Nov 14, 2022
Jiaxin Ye, Xincheng Wen, Yujie Wei, Yong Xu, Kunhong Liu, Hongming Shan

Via arXiv