speech


Prosodic ABX: A Language-Agnostic Method for Measuring Prosodic Contrast in Speech Representations

Apr 02, 2026

T5Gemma-TTS Technical Report

Apr 02, 2026

Development and multi-center evaluation of domain-adapted speech recognition for human-AI teaming in real-world gastrointestinal endoscopy

Apr 02, 2026

Robust Pitch Estimation and Tracking for Speakers Based on Subband Encoding and the Generalized Labeled Multi-Bernoulli Filter

Apr 02, 2026

CV-18 NER: Augmented Common Voice for Named Entity Recognition from Arabic Speech

Apr 02, 2026

Tracking the emergence of linguistic structure in self-supervised models learning from speech

Apr 02, 2026

OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models

Apr 02, 2026

Low-Burden LLM-Based Preference Learning: Personalizing Assistive Robots from Natural Language Feedback for Users with Paralysis

Apr 01, 2026

Evolutionary Multi-Objective Fusion of Deepfake Speech Detectors

Apr 01, 2026

VisG AV-HuBERT: Viseme-Guided AV-HuBERT

Apr 01, 2026