"speech recognition": models, code, and papers

Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens

Feb 03, 2024
Nay San, Georgios Paraskevopoulos, Aryaman Arora, Xiluo He, Prabhjot Kaur, Oliver Adams, Dan Jurafsky

Via arXiv

Exploring the limits of decoder-only models trained on public speech recognition corpora

Jan 31, 2024
Ankit Gupta, George Saon, Brian Kingsbury

Via arXiv

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024
Anirudh Raju, Aparna Khare, Di He, Ilya Sklyar, Long Chen, Sam Alptekin, Viet Anh Trinh, Zhe Zhang, Colin Vaz, Venkatesh Ravichandran, Roland Maas, Ariya Rastrow

Via arXiv

ArEEG_Chars: Dataset for Envisioned Speech Recognition using EEG for Arabic Characters

Feb 24, 2024
Hazem Darwish, Abdalrahman Al Malah, Khloud Al Jallad, Nada Ghneim

Via arXiv

Efficient data selection employing Semantic Similarity-based Graph Structures for model training

Feb 22, 2024
Roxana Petcu, Subhadeep Maji

Via arXiv

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

Jan 07, 2024
He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

Via arXiv

Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models

Feb 27, 2024
Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno

Via arXiv

CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition

Feb 10, 2024
Ioannis Ziogas, Hessa Alfalahi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis

Via arXiv

Multimodal Emotion Recognition from Raw Audio with Sinc-convolution

Feb 19, 2024
Xiaohui Zhang, Wenjie Fu, Mangui Liang

Via arXiv

Byte Pair Encoding Is All You Need For Automatic Bengali Speech Recognition

Jan 28, 2024
Ahnaf Mozib Samin

Via arXiv