Speech recognition


Speech recognition is the task of identifying words in spoken audio by analyzing the speaker's voice and language, then transcribing those words accurately as text.

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

Apr 03, 2026

Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition

Mar 31, 2026

CV-18 NER: Augmented Common Voice for Named Entity Recognition from Arabic Speech

Apr 02, 2026

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

Apr 01, 2026

Speech LLMs are Contextual Reasoning Transcribers

Apr 01, 2026

FLEURS-Kobani: Extending the FLEURS Dataset for Northern Kurdish

Mar 31, 2026

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Apr 01, 2026

TASU2: Controllable CTC Simulation for Alignment and Low-Resource Adaptation of Speech LLMs

Apr 09, 2026

Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Mar 27, 2026

LLM Probe: Evaluating LLMs for Low-Resource Languages

Mar 31, 2026