speech


ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models

Apr 11, 2026
Via arXiv

Phonemes vs. Projectors: An Investigation of Speech-Language Interfaces for LLM-based ASR

Apr 10, 2026
Via arXiv

AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models

Apr 10, 2026
Via arXiv

Toward using Speech to Sense Student Emotion in Remote Learning Environments

Apr 10, 2026
Via arXiv

Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts

Apr 10, 2026
Via arXiv

Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection

Apr 10, 2026
Via arXiv

Few-Shot Contrastive Adaptation for Audio Abuse Detection in Low-Resource Indic Languages

Apr 10, 2026
Via arXiv

GRM: Utility-Aware Jailbreak Attacks on Audio LLMs via Gradient-Ratio Masking

Apr 10, 2026
Via arXiv

DDSP-QbE++: Improving Speech Quality for Speech Anonymisation for Atypical Speech

Apr 10, 2026
Via arXiv

Regularized Entropy Information Adaptation with Temporal-Awareness Networks for Simultaneous Speech Translation

Apr 10, 2026
Via arXiv