speech


MIBURI: Towards Expressive Interactive Gesture Synthesis

Mar 03, 2026
Via arXiv

An Investigation Into Various Approaches For Bengali Long-Form Speech Transcription and Bengali Speaker Diarization

Mar 03, 2026
Via arXiv

Differentiable Time-Varying IIR Filtering for Real-Time Speech Denoising

Mar 03, 2026
Via arXiv

Single Microphone Own Voice Detection based on Simulated Transfer Functions for Hearing Aids

Mar 03, 2026
Via arXiv

HateMirage: An Explainable Multi-Dimensional Dataset for Decoding Faux Hate and Subtle Online Abuse

Mar 03, 2026
Via arXiv

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

Mar 03, 2026
Via arXiv

Using Songs to Improve Kazakh Automatic Speech Recognition

Mar 03, 2026
Via arXiv

Interpreting Speaker Characteristics in the Dimensions of Self-Supervised Speech Features

Mar 03, 2026
Via arXiv

Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection

Mar 03, 2026
Via arXiv

Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?

Mar 03, 2026
Via arXiv