speech


Privacy-Preserving End-to-End Full-Duplex Speech Dialogue Models

Mar 09, 2026
Via arXiv

Targeted Speaker Poisoning Framework in Zero-Shot Text-to-Speech

Mar 08, 2026
Via arXiv

StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

Mar 08, 2026
Via arXiv

EmbedTalk: Triplane-Free Talking Head Synthesis using Embedding-Driven Gaussian Deformation

Mar 08, 2026
Via arXiv

VoiceSHIELD-Small: Real-Time Malicious Speech Detection and Transcription

Mar 08, 2026
Via arXiv

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

Mar 08, 2026
Via arXiv

Exploring the potential and limitations of Model Merging for Multi-Domain Adaptation in ASR

Mar 05, 2026
Via arXiv

Oral to Web: Digitizing 'Zero Resource' Languages of Bangladesh

Mar 05, 2026
Via arXiv

Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards

Mar 05, 2026
Via arXiv

When Denoising Hinders: Revisiting Zero-Shot ASR with SAM-Audio and Whisper

Mar 05, 2026
Via arXiv