Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately as text.

Stuttering-Aware Automatic Speech Recognition for Indonesian Language

Jan 07, 2026

WESR: Scaling and Evaluating Word-level Event-Speech Recognition

Jan 08, 2026

Robust CAPTCHA Using Audio Illusions in the Era of Large Language Models: from Evaluation to Advances

Jan 13, 2026

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Jan 14, 2026

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models

Jan 08, 2026

ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech

Jan 18, 2026

Multi-channel multi-speaker transformer for speech recognition

Jan 06, 2026

RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation

Jan 25, 2026

Task Arithmetic with Support Languages for Low-Resource ASR

Jan 11, 2026

TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice

Jan 22, 2026