speech


Covo-Audio Technical Report

Feb 10, 2026

AUHead: Realistic Emotional Talking Head Generation via Action Units Control

Feb 10, 2026

Where Are We At with Automatic Speech Recognition for the Bambara Language?

Feb 10, 2026

Sparse Axonal and Dendritic Delays Enable Competitive SNNs for Keyword Classification

Feb 10, 2026

Unsupervised Cross-Lingual Part-of-Speech Tagging with Monolingual Corpora Only

Feb 10, 2026

When Less Is More? Diagnosing ASR Predictions in Sardinian via Layer-Wise Decoding

Feb 10, 2026

Emotion-Coherent Speech Data Augmentation and Self-Supervised Contrastive Style Training for Enhancing Kids's Story Speech Synthesis

Feb 10, 2026

PTS-SNN: A Prompt-Tuned Temporal Shift Spiking Neural Networks for Efficient Speech Emotion Recognition

Feb 09, 2026

Cross-Modal Bottleneck Fusion For Noise Robust Audio-Visual Speech Recognition

Feb 09, 2026

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Feb 09, 2026