Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately.

AI Meets Maritime Training: Precision Analytics for Enhanced Safety and Performance

Jul 02, 2025
Via arXiv

Thinking in Directivity: Speech Large Language Model for Multi-Talker Directional Speech Recognition

Jun 17, 2025
Via arXiv

First Steps Towards Voice Anonymization for Code-Switching Speech

Jul 02, 2025
Via arXiv

StylOch at PAN: Gradient-Boosted Trees with Frequency-Based Stylometric Features

Jul 16, 2025
Via arXiv

Audio-Vision Contrastive Learning for Phonological Class Recognition

Jul 23, 2025
Via arXiv

A Novel Hybrid Deep Learning Technique for Speech Emotion Detection using Feature Engineering

Jul 09, 2025
Via arXiv

Benchmarking Akan ASR Models Across Domain-Specific Datasets: A Comparative Evaluation of Performance, Scalability, and Adaptability

Jul 03, 2025
Via arXiv

Open-Source System for Multilingual Translation and Cloned Speech Synthesis

Jul 03, 2025
Via arXiv

Speech Tokenizer is Key to Consistent Representation

Jul 09, 2025
Via arXiv

Learning from Random Subspace Exploration: Generalized Test-Time Augmentation with Self-supervised Distillation

Jul 02, 2025
Via arXiv