Speech recognition


Speech recognition is the task of identifying the words in spoken audio and transcribing them accurately as text, drawing on both the speaker's voice and the language being spoken.

Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition -- Multimodal Fusion, Challenges, and Future Prospects

Sep 04, 2025

Objective and Subjective Evaluation of Diffusion-Based Speech Enhancement for Dysarthric Speech

Aug 25, 2025

HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization

Aug 17, 2025

Evaluating the Representation of Vowels in Wav2Vec Feature Extractor: A Layer-Wise Analysis Using MFCCs

Aug 25, 2025

Hybrid Decoding: Rapid Pass and Selective Detailed Correction for Sequence Models

Aug 27, 2025

Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications

Aug 25, 2025

Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025

Generative Annotation for ASR Named Entity Correction

Aug 28, 2025

NSPDI-SNN: An efficient lightweight SNN based on nonlinear synaptic pruning and dendritic integration

Aug 29, 2025

NE-PADD: Leveraging Named Entity Knowledge for Robust Partial Audio Deepfake Detection via Attention Aggregation

Sep 04, 2025