Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately into text.

PTS-SNN: A Prompt-Tuned Temporal Shift Spiking Neural Networks for Efficient Speech Emotion Recognition

Feb 09, 2026
Via arXiv

TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR

Feb 12, 2026
Via arXiv

ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark

Feb 13, 2026
Via arXiv

PISHYAR: A Socially Intelligent Smart Cane for Indoor Social Navigation and Multimodal Human-Robot Interaction for Visually Impaired People

Feb 13, 2026
Via arXiv

Prototype-Based Disentanglement for Controllable Dysarthric Speech Synthesis

Feb 09, 2026
Via arXiv

Beyond the Utterance: An Empirical Study of Very Long Context Speech Recognition

Feb 04, 2026
Via arXiv

Moonshine v2: Ergodic Streaming Encoder ASR for Latency-Critical Speech Applications

Feb 12, 2026
Via arXiv

B-GRPO: Unsupervised Speech Emotion Recognition based on Batched-Group Relative Policy Optimization

Feb 06, 2026
Via arXiv

Self-Supervised Learning for Speaker Recognition: A study and review

Feb 11, 2026
Via arXiv

Speech Emotion Recognition Leveraging OpenAI's Whisper Representations and Attentive Pooling Methods

Feb 05, 2026
Via arXiv