speech


BengaliSent140: A Large-Scale Bengali Binary Sentiment Dataset for Hate and Non-Hate Speech Classification

Jan 27, 2026
Via arXiv

T-Mimi: A Transformer-based Mimi Decoder for Real-Time On-Phone TTS

Jan 27, 2026
Via arXiv

Rethinking Discrete Speech Representation Tokens for Accent Generation

Jan 27, 2026
Via arXiv

MA-LipNet: Multi-Dimensional Attention Networks for Robust Lipreading

Jan 27, 2026
Via arXiv

Do we really need Self-Attention for Streaming Automatic Speech Recognition?

Jan 27, 2026
Via arXiv

Enhancing Speech Emotion Recognition using Dynamic Spectral Features and Kalman Smoothing

Jan 26, 2026
Via arXiv

Language Family Matters: Evaluating LLM-Based ASR Across Linguistic Boundaries

Jan 26, 2026
Via arXiv

MEGnifying Emotion: Sentiment Analysis from Annotated Brain Data

Jan 26, 2026
Via arXiv

Geneses: Unified Generative Speech Enhancement and Separation

Jan 26, 2026
Via arXiv

3DGesPolicy: Phoneme-Aware Holistic Co-Speech Gesture Generation Based on Action Control

Jan 26, 2026
Via arXiv