speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

Color-based Emotion Representation for Speech Emotion Recognition

Add code
Feb 18, 2026
Viaarxiv icon

An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction

Add code
Feb 23, 2026
Viaarxiv icon

A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations

Add code
Feb 26, 2026
Viaarxiv icon

Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation

Add code
Feb 21, 2026
Viaarxiv icon

Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks

Add code
Feb 19, 2026
Viaarxiv icon

Joint Enhancement and Classification using Coupled Diffusion Models of Signals and Logits

Add code
Feb 17, 2026
Viaarxiv icon

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Add code
Feb 17, 2026
Viaarxiv icon

TurkicNLP: An NLP Toolkit for Turkic Languages

Add code
Feb 22, 2026
Viaarxiv icon

ViSpeechFormer: A Phonemic Approach for Vietnamese Automatic Speech Recognition

Add code
Feb 10, 2026
Viaarxiv icon

Speech to Speech Synthesis for Voice Impersonation

Add code
Feb 13, 2026
Viaarxiv icon