speech


MAPSS: Manifold-based Assessment of Perceptual Source Separation

Sep 11, 2025
Via arXiv

HISPASpoof: A New Dataset For Spanish Speech Forensics

Sep 11, 2025
Via arXiv

Deploying AI for Signal Processing education: Selected challenges and intriguing opportunities

Sep 10, 2025
Via arXiv

TextlessRAG: End-to-End Visual Document RAG by Speech Without Text

Sep 10, 2025
Via arXiv

Machine Learning-Based Prediction of Speech Arrest During Direct Cortical Stimulation Mapping

Sep 10, 2025
Via arXiv

Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling

Sep 10, 2025
Via arXiv

Joint Learning using Mixture-of-Expert-Based Representation for Enhanced Speech Generation and Robust Emotion Recognition

Sep 10, 2025
Via arXiv

LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models

Sep 10, 2025
Via arXiv

Accelerating Diffusion Transformer-Based Text-to-Speech with Transformer Layer Caching

Sep 10, 2025
Via arXiv

CommonVoice-SpeechRE and RPG-MoGe: Advancing Speech Relation Extraction with a New Dataset and Multi-Order Generative Framework

Sep 10, 2025
Via arXiv