Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

Apr 01, 2026, via arXiv

Speech LLMs are Contextual Reasoning Transcribers

Apr 01, 2026, via arXiv

Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition

Mar 31, 2026, via arXiv

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Apr 01, 2026, via arXiv

FLEURS-Kobani: Extending the FLEURS Dataset for Northern Kurdish

Mar 31, 2026, via arXiv

LLM Probe: Evaluating LLMs for Low-Resource Languages

Mar 31, 2026, via arXiv

EBuddy: a workflow orchestrator for industrial human-machine collaboration

Mar 30, 2026, via arXiv

Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models

Mar 30, 2026, via arXiv

On the Role of Encoder Depth: Pruning Whisper and LoRA Fine-Tuning in SLAM-ASR

Mar 30, 2026, via arXiv

Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Mar 27, 2026, via arXiv