"speech recognition": models, code, and papers

XLS-R Deep Learning Model for Multilingual ASR on Low-Resource Languages: Indonesian, Javanese, and Sundanese

Jan 12, 2024
Panji Arisaputra, Alif Tri Handoyo, Amalia Zahra

DSNet: Disentangled Siamese Network with Neutral Calibration for Speech Emotion Recognition

Dec 25, 2023
Chengxin Chen, Pengyuan Zhang

Towards Automatic Data Augmentation for Disordered Speech Recognition

Dec 14, 2023
Zengrui Jin, Xurong Xie, Tianzi Wang, Mengzhe Geng, Jiajun Deng, Guinan Li, Shujie Hu, Xunying Liu

BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators

Jan 08, 2024
Zihan Zhang, Jiayao Sun, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie

Punctuation Restoration Improves Structure Understanding without Supervision

Feb 13, 2024
Junghyun Min, Minho Lee, Woochul Lee, Yeonsoo Lee

Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation

Jan 01, 2024
Huimeng Wang, Zengrui Jin, Mengzhe Geng, Shujie Hu, Guinan Li, Tianzi Wang, Haoning Xu, Xunying Liu

Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder

Oct 06, 2023
Zih-Jyun Lin, Yi-Ju Chen, Po-Chih Kuo, Likai Huang, Chaur-Jong Hu, Cheng-Yu Chen

Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations

Feb 10, 2024
Jialu Li, Mark Hasegawa-Johnson, Nancy L. McElwain

Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

Nov 06, 2023
Rabindra Nath Nandi, Mehadi Hasan Menon, Tareq Al Muntasir, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Tariqul Islam, Shammur Absar Chowdhury, Firoj Alam

Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization

Jan 16, 2024
Ming Cheng, Ming Li
