Speech Recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning

Dec 26, 2025
Via arXiv

Enhancing Fully Formatted End-to-End Speech Recognition with Knowledge Distillation via Multi-Codebook Vector Quantization

Dec 22, 2025
Via arXiv

TICL+: A Case Study On Speech In-Context Learning for Children's Speech Recognition

Dec 20, 2025
Via arXiv

Phoneme-based speech recognition driven by large language models and sampling marginalization

Dec 20, 2025
Via arXiv

ElfCore: A 28nm Neural Processor Enabling Dynamic Structured Sparse Training and Online Self-Supervised Learning with Activity-Dependent Weight Update

Dec 24, 2025
Via arXiv

Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models

Dec 26, 2025
Via arXiv

Explainable Transformer-CNN Fusion for Noise-Robust Speech Emotion Recognition

Dec 20, 2025
Via arXiv

Zero-Shot Recognition of Dysarthric Speech Using Commercial Automatic Speech Recognition and Multimodal Large Language Models

Dec 19, 2025
Via arXiv

Incorporating Error Level Noise Embedding for Improving LLM-Assisted Robustness in Persian Speech Recognition

Dec 19, 2025
Via arXiv

From Speech to Subtitles: Evaluating ASR Models in Subtitling Italian Television Programs

Dec 22, 2025
Via arXiv