Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately.

PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition

Sep 16, 2025
Via arXiv

FunAudio-ASR Technical Report

Sep 15, 2025
Via arXiv

Behind the Scenes: Mechanistic Interpretability of LoRA-adapted Whisper for Speech Emotion Recognition

Sep 11, 2025
Via arXiv

Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model

Sep 10, 2025
Via arXiv

Joint Learning using Mixture-of-Expert-Based Representation for Enhanced Speech Generation and Robust Emotion Recognition

Sep 10, 2025
Via arXiv

Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling

Sep 10, 2025
Via arXiv

Identifying and Calibrating Overconfidence in Noisy Speech Recognition

Sep 08, 2025
Via arXiv

A Bottom-up Framework with Language-universal Speech Attribute Modeling for Syllable-based ASR

Sep 09, 2025
Via arXiv

EnvX: Agentize Everything with Agentic AI

Sep 09, 2025
Via arXiv

Layer-wise Analysis for Quality of Multilingual Synthesized Speech

Sep 05, 2025
Via arXiv