speech


Cost Analysis of Human-corrected Transcription for Predominately Oral Languages

Oct 14, 2025

Beating Harmful Stereotypes Through Facts: RAG-based Counter-speech Generation

Oct 14, 2025

Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models

Oct 14, 2025

Content Anonymization for Privacy in Long-form Audio

Oct 14, 2025

I-DCCRN-VAE: An Improved Deep Representation Learning Framework for Complex VAE-based Single-channel Speech Enhancement

Oct 14, 2025

A Phase Synthesizer for Decorrelation to Improve Acoustic Feedback Cancellation

Oct 14, 2025

A Study of the Removability of Speaker-Adversarial Perturbations

Oct 10, 2025

Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking

Oct 10, 2025

The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach

Oct 10, 2025

Estimating Brain Activity with High Spatial and Temporal Resolution using a Naturalistic MEG-fMRI Encoding Model

Oct 10, 2025