speech


Multi-Distillation from Speech and Music Representation Models

Jun 08, 2025
Via arXiv

E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models

Jun 08, 2025
Via arXiv

Speech Recognition on TV Series with Video-guided Post-Correction

Jun 08, 2025
Via arXiv

Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training

Jun 08, 2025
Via arXiv

"In This Environment, As That Speaker": A Text-Driven Framework for Multi-Attribute Speech Conversion

Jun 08, 2025
Via arXiv

Technical Report: A Practical Guide to Kaldi ASR Optimization

Jun 08, 2025
Via arXiv

Rhythm Features for Speaker Identification

Jun 07, 2025
Via arXiv

SynHate: Detecting Hate Speech in Synthetic Deepfake Audio

Jun 07, 2025
Via arXiv

Accurate analysis of the pitch pulse-based magnitude/phase structure of natural vowels and assessment of three lightweight time/frequency voicing restoration methods

Jun 07, 2025
Via arXiv

Automatic Speech Recognition of African American English: Lexical and Contextual Effects

Jun 07, 2025
Via arXiv