Speech recognition


Speech recognition is the task of identifying spoken words from audio, analyzing the speaker's voice and language, and transcribing those words accurately as text.

Large Language Model Data Generation for Enhanced Intent Recognition in German Speech

Aug 08, 2025

TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree

Aug 12, 2025

A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions

Aug 11, 2025

The TEA-ASLP System for Multilingual Conversational Speech Recognition and Speech Diarization in MLC-SLM 2025 Challenge

Jul 24, 2025

Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning

Aug 12, 2025

MOVER: Combining Multiple Meeting Recognition Systems

Aug 07, 2025

Efficient Scaling for LLM-based ASR

Aug 06, 2025

Speech LLMs in Low-Resource Scenarios: Data Volume Requirements and the Impact of Pretraining on High-Resource Languages

Aug 07, 2025

System Report for CCL25-Eval Task 10: SRAG-MAV for Fine-Grained Chinese Hate Speech Recognition

Jul 24, 2025

NVSpeech: An Integrated and Scalable Pipeline for Human-Like Speech Modeling with Paralinguistic Vocalizations

Aug 06, 2025