speech


Multi-Target Backdoor Attacks Against Speaker Recognition

Aug 13, 2025
Via arXiv

DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models

Aug 12, 2025

QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems

Aug 12, 2025

Munsit at NADI 2025 Shared Task 2: Pushing the Boundaries of Multidialectal Arabic ASR with Weakly Supervised Pretraining and Continual Supervised Fine-tuning

Aug 12, 2025

MultiAiTutor: Child-Friendly Educational Multilingual Speech Generation Tutor with LLMs

Aug 12, 2025

Revealing the Role of Audio Channels in ASR Performance Degradation

Aug 12, 2025

Robot can reduce superior's dominance in group discussions with human social hierarchy

Aug 12, 2025

Transient Noise Removal via Diffusion-based Speech Inpainting

Aug 12, 2025

Selection of Layers from Self-supervised Learning Models for Predicting Mean-Opinion-Score of Speech

Aug 12, 2025

DeCRED: Decoder-Centric Regularization for Encoder-Decoder Based Speech Recognition

Aug 12, 2025