Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the voice and language, then accurately transcribing them as text.

AmbER$^2$: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text

Jan 25, 2026

Noise-Robust AV-ASR Using Visual Features Both in the Whisper Encoder and Decoder

Jan 26, 2026

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions

Jan 25, 2026

Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition

Jan 18, 2026

Typhoon ASR Real-time: FastConformer-Transducer for Thai Automatic Speech Recognition

Jan 19, 2026

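Benchmarking von ASR-Modellen im deutschen medizinischen Kontext: Eine Leistungsanalyse anhand von Anamnesegesprächen [Benchmarking ASR Models in the German Medical Context: A Performance Analysis Based on Medical History Interviews]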

Jan 23, 2026

DementiaBank-Emotion: A Multi-Rater Emotion Annotation Corpus for Alzheimer's Disease Speech (Version 1.0)

Feb 04, 2026

From Human Speech to Ocean Signals: Transferring Speech Large Models for Underwater Acoustic Target Recognition

Jan 26, 2026

Scaling Ambiguity: Augmenting Human Annotation in Speech Emotion Recognition with Audio-Language Models

Jan 21, 2026

SSVD-O: Parameter-Efficient Fine-Tuning with Structured SVD for Speech Recognition

Jan 18, 2026