Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

PROFASR-BENCH: A Benchmark for Context-Conditioned ASR in High-Stakes Professional Speech

Dec 29, 2025
Via arXiv

Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning

Dec 26, 2025
Via arXiv

Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models

Dec 26, 2025
Via arXiv

VALLR-Pin: Dual-Decoding Visual Speech Recognition for Mandarin with Pinyin-Guided LLM Refinement

Dec 23, 2025
Via arXiv

ElfCore: A 28nm Neural Processor Enabling Dynamic Structured Sparse Training and Online Self-Supervised Learning with Activity-Dependent Weight Update

Dec 24, 2025
Via arXiv

Enhancing Fully Formatted End-to-End Speech Recognition with Knowledge Distillation via Multi-Codebook Vector Quantization

Dec 22, 2025
Via arXiv

Semantic Codebooks as Effective Priors for Neural Speech Compression

Dec 25, 2025
Via arXiv

TICL+: A Case Study On Speech In-Context Learning for Children's Speech Recognition

Dec 20, 2025
Via arXiv

Phoneme-based speech recognition driven by large language models and sampling marginalization

Dec 20, 2025
Via arXiv

From Speech to Subtitles: Evaluating ASR Models in Subtitling Italian Television Programs

Dec 22, 2025
Via arXiv