Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling

Sep 10, 2025

A Bottom-up Framework with Language-universal Speech Attribute Modeling for Syllable-based ASR

Sep 09, 2025

CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese

Aug 27, 2025

A Study of the Removability of Speaker-Adversarial Perturbations

Oct 10, 2025

LatPhon: Lightweight Multilingual G2P for Romance Languages and English

Sep 03, 2025

Improving Noise Robust Audio-Visual Speech Recognition via Router-Gated Cross-Modal Feature Fusion

Aug 26, 2025

Designing Practical Models for Isolated Word Visual Speech Recognition

Aug 25, 2025

Layer-wise Analysis for Quality of Multilingual Synthesized Speech

Sep 05, 2025

Contextualized Token Discrimination for Speech Search Query Correction

Sep 04, 2025

Speech Emotion Recognition via Entropy-Aware Score Selection

Aug 28, 2025