Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing them accurately as text.

First Steps Towards Voice Anonymization for Code-Switching Speech

Jul 02, 2025

Privacy Disclosure of Similarity in Speech and Language Processing

Aug 07, 2025

A Novel Hybrid Deep Learning Technique for Speech Emotion Detection using Feature Engineering

Jul 09, 2025

StylOch at PAN: Gradient-Boosted Trees with Frequency-Based Stylometric Features

Jul 16, 2025

Benchmarking Akan ASR Models Across Domain-Specific Datasets: A Comparative Evaluation of Performance, Scalability, and Adaptability

Jul 03, 2025

Open-Source System for Multilingual Translation and Cloned Speech Synthesis

Jul 03, 2025

Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios

Jun 17, 2025

Audio-Vision Contrastive Learning for Phonological Class Recognition

Jul 23, 2025

Learning from Random Subspace Exploration: Generalized Test-Time Augmentation with Self-supervised Distillation

Jul 02, 2025

Speech Tokenizer is Key to Consistent Representation

Jul 09, 2025