speech


HydraQE: OSU's Submission for the IWSLT 2026 Speech Translation Metrics Shared Task

Jun 07, 2026
Via arXiv

Speaker-Invariant Representation Learning for Spoofing Detection via Gradient Reversal and A Variational Information Bottleneck

Jun 07, 2026
Via arXiv

TRADE: Transducer-Augmented Decoder for Speech LLM

Jun 07, 2026
Via arXiv

Titans-as-a-Layer: Test-Time Memory for Conversational Speech Emotion Recognition

Jun 07, 2026
Via arXiv

From A to B to A: Palindromic Zero-Shot Voice Conversion with Non-Parallel Data

Jun 07, 2026
Via arXiv

G-MaP-SE: Guided Speech Enhancement via GMM-Based Prior Matching

Jun 07, 2026
Via arXiv

Fast and Robust On-Device Speaker Diarization: Relative Minimum Cluster Size for Stride-Accelerated Pipelines

Jun 07, 2026
Via arXiv

AeroSpectra Sentinel: An Auditable LLM Prompt-Chaining Decision-Support Workflow for Acute Asthma Risk Assessment from Respiratory Sounds and Clinical Signals

Jun 06, 2026
Via arXiv

Paediatric-HGNN: A Hybrid Heterogeneous Graph Neural Network for Detecting Disfluency in Children's Speech via Multiscale Acoustic Fusion

Jun 06, 2026
Via arXiv

Mitigating Proxy-to-Wild Domain Gap in Deepfake Speech

Jun 05, 2026
Via arXiv