Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, and transcribing those words accurately as text.

WESR: Scaling and Evaluating Word-level Event-Speech Recognition

Jan 08, 2026, via arXiv

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models

Jan 08, 2026, via arXiv

Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects

Jan 12, 2026, via arXiv

Robust CAPTCHA Using Audio Illusions in the Era of Large Language Models: from Evaluation to Advances

Jan 13, 2026, via arXiv

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Jan 14, 2026, via arXiv

Multi-channel multi-speaker transformer for speech recognition

Jan 06, 2026, via arXiv

ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech

Jan 18, 2026, via arXiv

Task Arithmetic with Support Languages for Low-Resource ASR

Jan 11, 2026, via arXiv

Variational decomposition autoencoding improves disentanglement of latent representations

Jan 11, 2026, via arXiv

An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution

Jan 09, 2026, via arXiv