Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately as text.

HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization

Aug 17, 2025
Via arXiv

Evaluating the Representation of Vowels in Wav2Vec Feature Extractor: A Layer-Wise Analysis Using MFCCs

Aug 25, 2025
Via arXiv

Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025
Via arXiv

Hybrid Decoding: Rapid Pass and Selective Detailed Correction for Sequence Models

Aug 27, 2025
Via arXiv

Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications

Aug 25, 2025
Via arXiv

Generative Annotation for ASR Named Entity Correction

Aug 28, 2025
Via arXiv

Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches

Aug 11, 2025
Via arXiv

NSPDI-SNN: An efficient lightweight SNN based on nonlinear synaptic pruning and dendritic integration

Aug 29, 2025
Via arXiv

Attention2Probability: Attention-Driven Terminology Probability Estimation for Robust Speech-to-Text System

Aug 26, 2025
Via arXiv

Can Layer-wise SSL Features Improve Zero-Shot ASR Performance for Children's Speech?

Aug 28, 2025
Via arXiv