Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately.

E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis

Nov 10, 2025

Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs

Oct 26, 2025

Ground Truth Generation for Multilingual Historical NLP using LLMs

Nov 18, 2025

CantoASR: Prosody-Aware ASR-LALM Collaboration for Low-Resource Cantonese

Nov 06, 2025

Enabling Automatic Self-Talk Detection via Earables

Nov 10, 2025

LRW-Persian: Lip-reading in the Wild Dataset for Persian Language

Oct 26, 2025

Overview of the MEDIQA-OE 2025 Shared Task on Medical Order Extraction from Doctor-Patient Consultations

Oct 30, 2025

HMM for short independent sequences: Multiple sequence Baum-Welch application

Oct 30, 2025

Reference Microphone Selection for Guided Source Separation based on the Normalized L-p Norm

Oct 31, 2025

The Tonogenesis Continuum in Tibetan: A Computational Investigation

Oct 26, 2025