End-to-End Speech Recognition


End-to-end speech recognition transcribes speech directly into text with a single model, rather than through a pipeline of separately trained acoustic, pronunciation, and language models.

Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition

May 29, 2025, via arXiv

Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

May 29, 2025, via arXiv

Exploring Generative Error Correction for Dysarthric Speech Recognition

May 26, 2025, via arXiv

GMU Systems for the IWSLT 2025 Low-Resource Speech Translation Shared Task

May 27, 2025, via arXiv

Pretraining Multi-Speaker Identification for Neural Speaker Diarization

May 30, 2025, via arXiv

Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge

May 28, 2025, via arXiv

KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization

May 26, 2025, via arXiv

Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR

May 19, 2025, via arXiv

An End-to-End Approach for Child Reading Assessment in the Xhosa Language

May 23, 2025, via arXiv

Word Level Timestamp Generation for Automatic Speech Recognition and Translation

May 21, 2025, via arXiv