"speech recognition": models, code, and papers

Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition

May 17, 2022
Zengrui Jin, Mengzhe Geng, Jiajun Deng, Tianzi Wang, Shujie Hu, Guinan Li, Xunying Liu

Via arXiv

Residual Language Model for End-to-end Speech Recognition

Jun 15, 2022
Emiru Tsunoo, Yosuke Kashiwagi, Chaitanya Narisetty, Shinji Watanabe

Via arXiv

Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking

Apr 19, 2022
Jinghui Xu, Jiangshan Zhang, Jifeng Zhu, Yong Yang

Via arXiv

Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR

Apr 28, 2023
Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

Via arXiv

Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

Mar 19, 2022
Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng

Via arXiv

OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset

Jan 16, 2023
Jeongkyun Park, Jung-Wook Hwang, Kwanghee Choi, Seung-Hyun Lee, Jun Hwan Ahn, Rae-Hong Park, Hyung-Min Park

Via arXiv

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Apr 13, 2023
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

Via arXiv

Joint unsupervised and supervised learning for context-aware language identification

Mar 29, 2023
Jinseok Park, Hyung Yong Kim, Jihwan Park, Byeong-Yeol Kim, Shukjae Choi, Yunkyu Lim

Via arXiv

Confidence Score Based Conformer Speaker Adaptation for Speech Recognition

Jun 24, 2022
Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Mengzhe Geng, Guinan Li, Xunying Liu, Helen Meng

Via arXiv