Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, and transcribing them accurately as text.

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

Apr 03, 2026
Via arXiv

CV-18 NER: Augmented Common Voice for Named Entity Recognition from Arabic Speech

Apr 02, 2026
Via arXiv

Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Mar 27, 2026
Via arXiv

Benchmarking Multilingual Speech Models on Pashto: Zero-Shot ASR, Script Failure, and Cross-Domain Evaluation

Apr 06, 2026
Via arXiv

FLEURS-Kobani: Extending the FLEURS Dataset for Northern Kurdish

Mar 31, 2026
Via arXiv

AdaLTM: Adaptive Layer-wise Task Vector Merging for Categorical Speech Emotion Recognition with ASR Knowledge Integration

Mar 26, 2026
Via arXiv

A Sociolinguistic Analysis of Automatic Speech Recognition Bias in Newcastle English

Mar 25, 2026
Via arXiv

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Apr 01, 2026
Via arXiv

SEDTalker: Emotion-Aware 3D Facial Animation Using Frame-Level Speech Emotion Diarization

Apr 14, 2026
Via arXiv

EBuddy: a workflow orchestrator for industrial human-machine collaboration

Mar 30, 2026
Via arXiv