speech


Audio-Image Alignment as a Continued-Pretraining Stage Improves Low-Resource ASR

Jun 23, 2026

Joint Learning of Covariance Estimation and White Noise Gain for Robust MVDR Beamforming

Jun 23, 2026

Breaking Shortcut Learning for Cross-Trial EEG-Guided Target Speech Extraction via Two-Stage Training

Jun 23, 2026

ParaPairAudioBench: Paralinguistic Pairwise Audio Benchmark for LALM-as-a-Judge

Jun 23, 2026

VieSpeaker: A Large-Scale Vietnamese Speaker Recognition Dataset Beyond Visual Dependency

Jun 23, 2026

The Watermark Shortcut: How Provenance Marking Sabotages Audio Deepfake Detection

Jun 22, 2026

Don't Listen to Me: A Lightweight, Low-Latency Model for Own-Voice Cancellation in Far-Field Speech Enhancement

Jun 22, 2026

On the Effect of Segmentation Width and Cluster Size on Speech Resynthesis and Continuation in Generative Spoken Language Models

Jun 22, 2026

Acoustic Landmark Detector based on Conformer and HuBERT

Jun 22, 2026

An Acoustic Landmark Database of the English Lexicon via Articulatory Synthesis

Jun 22, 2026