Speech Recognition


Speech recognition is the task of identifying the words in spoken audio by analyzing both the voice signal and the language, and transcribing those words accurately as text.

Voxtral Realtime

Feb 11, 2026
Via arXiv

ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark

Feb 13, 2026
Via arXiv

TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR

Feb 12, 2026
Via arXiv

PISHYAR: A Socially Intelligent Smart Cane for Indoor Social Navigation and Multimodal Human-Robot Interaction for Visually Impaired People

Feb 13, 2026
Via arXiv

PTS-SNN: A Prompt-Tuned Temporal Shift Spiking Neural Networks for Efficient Speech Emotion Recognition

Feb 09, 2026
Via arXiv

Moonshine v2: Ergodic Streaming Encoder ASR for Latency-Critical Speech Applications

Feb 12, 2026
Via arXiv

Enabling Automatic Disordered Speech Recognition: An Impaired Speech Dataset in the Akan Language

Feb 05, 2026
Via arXiv

Self-Supervised Learning for Speaker Recognition: A study and review

Feb 11, 2026
Via arXiv

Prototype-Based Disentanglement for Controllable Dysarthric Speech Synthesis

Feb 09, 2026
Via arXiv

Frontend Token Enhancement for Token-Based Speech Recognition

Feb 04, 2026
Via arXiv