Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, then accurately transcribing them as text.

Cross-Modal Bottleneck Fusion For Noise Robust Audio-Visual Speech Recognition

Feb 09, 2026
Via arXiv

On the Sensitivity of Firing Rate-Based Federated Spiking Neural Networks to Differential Privacy

Feb 12, 2026
Via arXiv

PISHYAR: A Socially Intelligent Smart Cane for Indoor Social Navigation and Multimodal Human-Robot Interaction for Visually Impaired People

Feb 13, 2026
Via arXiv

TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR

Feb 12, 2026
Via arXiv

Voxtral Realtime

Feb 11, 2026
Via arXiv

Moonshine v2: Ergodic Streaming Encoder ASR for Latency-Critical Speech Applications

Feb 12, 2026
Via arXiv

Self-Supervised Learning for Speaker Recognition: A study and review

Feb 11, 2026
Via arXiv

PTS-SNN: A Prompt-Tuned Temporal Shift Spiking Neural Networks for Efficient Speech Emotion Recognition

Feb 09, 2026
Via arXiv

RE-LLM: Refining Empathetic Speech-LLM Responses by Integrating Emotion Nuance

Feb 11, 2026
Via arXiv

Prototype-Based Disentanglement for Controllable Dysarthric Speech Synthesis

Feb 09, 2026
Via arXiv