Speech recognition

Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Evaluation of Generative Models for Emotional 3D Animation Generation in VR

Dec 18, 2025 (via arXiv)

DASH: Dialogue-Aware Similarity and Handshake Recognition for Topic Segmentation in Public-Channel Conversations

Dec 17, 2025 (via arXiv)

NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data

Dec 14, 2025 (via arXiv)

ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation

Dec 10, 2025 (via arXiv)

Poster: Recognizing Hidden-in-the-Ear Private Key for Reliable Silent Speech Interface Using Multi-Task Learning

Dec 18, 2025 (via arXiv)

TTA: Transcribe, Translate and Alignment for Cross-lingual Speech Representation

Nov 18, 2025 (via arXiv)

AfriSpeech-MultiBench: A Verticalized Multidomain Multicountry Benchmark Suite for African Accented English ASR

Nov 18, 2025 (via arXiv)

Listen Like a Teacher: Mitigating Whisper Hallucinations using Adaptive Layer Attention and Knowledge Distillation

Nov 18, 2025 (via arXiv)

Ground Truth Generation for Multilingual Historical NLP using LLMs

Nov 18, 2025 (via arXiv)

Scriboora: Rethinking Human Pose Forecasting

Nov 19, 2025 (via arXiv)