speech


Vortex: Multi-Modal Fusion System for Intelligent Video Retrieval

Jun 18, 2026

Segment-Level Mandarin Chinese Speech-Based Cognitive Impairment Detection via an Autoencoder with Contrastive Learning

Jun 18, 2026

ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion

Jun 18, 2026

Repurposing a Speech Classifier for Guided Diffusion-Based Speech Generation

Jun 18, 2026

LLM-Based Synthetic Ground Truth Generation for Audio-Based Emotion Classification via In-Context Learning

Jun 18, 2026

Joycent: Diffusion-based Accent TTS without Accented Phone Prediction

Jun 18, 2026

Time-Unconditional Generative Speech Enhancement via Autonomous Rectified Flow

Jun 18, 2026

PrefSQA: Pairwise Preference Prediction for Speech Quality Assessment and the Critical Role of High Quality Datasets

Jun 17, 2026

A BART-based approach with hierarchical strategy for Vietnamese abstractive multi-document summarization

Jun 17, 2026

FlowFake: Liquid Networks for Audio Deepfake Detection

Jun 17, 2026