Speech Recognition


Speech recognition is the task of converting spoken language into text: identifying the words in an audio signal, using both acoustic and linguistic cues, and transcribing them accurately.

PRiSM: Benchmarking Phone Realization in Speech Models

Jan 20, 2026
Via arXiv

A Baseline Multimodal Approach to Emotion Recognition in Conversations

Jan 31, 2026
Via arXiv

Motion-to-Response Content Generation via Multi-Agent AI System with Real-Time Safety Verification

Jan 20, 2026
Via arXiv

Beyond Mapping: Domain-Invariant Representations via Spectral Embedding of Optimal Transport Plans

Jan 19, 2026
Via arXiv

RLBR: Reinforcement Learning with Biasing Rewards for Contextual Speech Large Language Models

Jan 19, 2026
Via arXiv

An Effective Energy Mask-based Adversarial Evasion Attacks against Misclassification in Speaker Recognition Systems

Jan 29, 2026
Via arXiv

ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech

Jan 18, 2026
Via arXiv

HoverAI: An Embodied Aerial Agent for Natural Human-Drone Interaction

Jan 20, 2026
Via arXiv

TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice

Jan 22, 2026
Via arXiv

RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation

Jan 25, 2026
Via arXiv