Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately into text.

EBuddy: a workflow orchestrator for industrial human-machine collaboration

Mar 30, 2026
Via arXiv

Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models

Mar 30, 2026
Via arXiv

On the Role of Encoder Depth: Pruning Whisper and LoRA Fine-Tuning in SLAM-ASR

Mar 30, 2026
Via arXiv

Cascade-Free Mandarin Visual Speech Recognition via Semantic-Guided Cross-Representation Alignment

Mar 23, 2026
Via arXiv

Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

Mar 24, 2026
Via arXiv

Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR

Mar 27, 2026
Via arXiv

Over-the-air White-box Attack on the Wav2Vec Speech Recognition Neural Network

Mar 17, 2026
Via arXiv

How Class Ontology and Data Scale Affect Audio Transfer Learning

Mar 26, 2026
Via arXiv

Goodness-of-pronunciation without phoneme time alignment

Mar 26, 2026
Via arXiv

Crab: Multi Layer Contrastive Supervision to Improve Speech Emotion Recognition Under Both Acted and Natural Speech Condition

Mar 24, 2026
Via arXiv