speech


Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026

LiveGesture: Streamable Co-Speech Gesture Generation Model

Apr 13, 2026

Empowering Video Translation using Multimodal Large Language Models

Apr 13, 2026

Efficient Training for Cross-lingual Speech Language Models

Apr 13, 2026

HumDial-EIBench: A Human-Recorded Multi-Turn Emotional Intelligence Benchmark for Audio Language Models

Apr 13, 2026

ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing

Apr 13, 2026

Speech-preserving active noise control: a deep learning approach in reverberant environments

Apr 13, 2026

DialogueSidon: Recovering Full-Duplex Dialogue Tracks from In-the-Wild Dialogue Audio

Apr 13, 2026

Saar-Voice: A Multi-Speaker Saarbrücken Dialect Speech Corpus

Apr 13, 2026

MEME-Fusion@CHiPSAL 2026: Multimodal Ablation Study of Hate Detection and Sentiment Analysis on Nepali Memes

Apr 13, 2026