Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, then transcribing those words accurately as text.

Self-Improvement for Audio Large Language Model using Unlabeled Speech

Jul 27, 2025
Via arXiv

Continuous Saudi Sign Language Recognition: A Vision Transformer Approach

Sep 03, 2025
Via arXiv

Tiny Noise-Robust Voice Activity Detector for Voice Assistants

Jul 29, 2025
Via arXiv

Rethinking Tokenization for Rich Morphology: The Dominance of Unigram over BPE and Morphological Alignment

Aug 11, 2025
Via arXiv

BoSS: Beyond-Semantic Speech

Jul 23, 2025
Via arXiv

Step-Audio 2 Technical Report

Jul 24, 2025
Via arXiv

Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance

Aug 25, 2025
Via arXiv

Multi-Target Backdoor Attacks Against Speaker Recognition

Aug 13, 2025
Via arXiv

Privacy Disclosure of Similarity in Speech and Language Processing

Aug 07, 2025
Via arXiv

Touch Speaks, Sound Feels: A Multimodal Approach to Affective and Social Touch from Robots to Humans

Aug 11, 2025
Via arXiv