"speech recognition": models, code, and papers

Byte Pair Encoding Is All You Need For Automatic Bengali Speech Recognition

Jan 28, 2024
Ahnaf Mozib Samin

SeMaScore : a new evaluation metric for automatic speech recognition tasks

Jan 15, 2024
Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar

Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline

Jan 26, 2024
Seonmin Koo, Chanjun Park, Jinsung Kim, Jaehyung Seo, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim

CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition

Feb 10, 2024
Ioannis Ziogas, Hessa Alfalahi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR

Mar 04, 2024
Zhiyun Fan, Linhao Dong, Jun Zhang, Lu Lu, Zejun Ma

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Jan 05, 2024
Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023

Jan 07, 2024
He Wang, Pengcheng Guo, Wei Chen, Pan Zhou, Lei Xie

MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition

Jan 07, 2024
He Wang, Pengcheng Guo, Pan Zhou, Lei Xie

Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units

Jan 18, 2024
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Se Jin Park, Yong Man Ro

Efficient data selection employing Semantic Similarity-based Graph Structures for model training

Feb 22, 2024
Roxana Petcu, Subhadeep Maji
