Speech Recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems

Add code
Jun 16, 2025
Viaarxiv icon

NTU Speechlab LLM-Based Multilingual ASR System for Interspeech MLC-SLM Challenge 2025

Add code
Jun 16, 2025
Viaarxiv icon

Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models

Add code
Jun 16, 2025
Viaarxiv icon

Bi-directional Context-Enhanced Speech Large Language Models for Multilingual Conversational ASR

Add code
Jun 16, 2025
Viaarxiv icon

BUT System for the MLC-SLM Challenge

Add code
Jun 16, 2025
Viaarxiv icon

SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition

Add code
Jun 15, 2025
Viaarxiv icon

From Flat to Feeling: A Feasibility and Impact Study on Dynamic Facial Emotions in AI-Generated Avatars

Add code
Jun 16, 2025
Viaarxiv icon

Adapting Whisper for Streaming Speech Recognition via Two-Pass Decoding

Add code
Jun 13, 2025
Viaarxiv icon

Lightweight and Robust Multi-Channel End-to-End Speech Recognition with Spherical Harmonic Transform

Add code
Jun 13, 2025
Viaarxiv icon

(SimPhon Speech Test): A Data-Driven Method for In Silico Design and Validation of a Phonetically Balanced Speech Test

Add code
Jun 13, 2025
Viaarxiv icon