speech


Explainable AI: Context-Aware Layer-Wise Integrated Gradients for Explaining Transformer Models

Feb 18, 2026
Via arXiv

Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens

Feb 18, 2026
Via arXiv

Supercharging Agenda Setting Research: The ParlaCAP Dataset of 28 European Parliaments and a Scalable Multilingual LLM-Based Classification

Feb 18, 2026
Via arXiv

How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection

Feb 18, 2026
Via arXiv

Color-based Emotion Representation for Speech Emotion Recognition

Feb 18, 2026
Via arXiv

LLM-to-Speech: A Synthetic Data Pipeline for Training Dialectal Text-to-Speech Models

Feb 17, 2026
Via arXiv

The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs

Feb 17, 2026
Via arXiv

Bottleneck Transformer-Based Approach for Improved Automatic STOI Score Prediction

Feb 17, 2026
Via arXiv

UniTAF: A Modular Framework for Joint Text-to-Speech and Audio-to-Face Modeling

Feb 17, 2026
Via arXiv

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Feb 17, 2026
Via arXiv