Yibo Bai

MDD: a Mask Diffusion Detector to Protect Speaker Verification Systems from Adversarial Perturbations

Aug 26, 2025

Yibo Bai, Sizhou Chen, Michele Panariello, Xiao-Lei Zhang, Massimiliano Todisco, Nicholas Evans

Abstract: Speaker verification systems are increasingly deployed in security-sensitive applications but remain highly vulnerable to adversarial perturbations. In this work, we propose the Mask Diffusion Detector (MDD), a novel adversarial detection and purification framework based on a text-conditioned masked diffusion model. During training, MDD applies partial masking to Mel-spectrograms and progressively adds noise through a forward diffusion process, simulating the degradation of clean speech features. A reverse process then reconstructs the clean representation conditioned on the input transcription. Unlike prior approaches, MDD does not require adversarial examples or large-scale pretraining. Experimental results show that MDD achieves strong adversarial detection performance and outperforms prior state-of-the-art methods, including both diffusion-based and neural codec-based approaches. Furthermore, MDD effectively purifies adversarially-manipulated speech, restoring speaker verification performance to levels close to those observed under clean conditions. These findings demonstrate the potential of diffusion-based masking strategies for secure and reliable speaker verification systems.

* Accepted by APSIPA ASC 2025
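
The training-time corruption described in the MDD abstract (partial masking of a Mel-spectrogram followed by forward diffusion) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the authors' implementation: the frame-wise zero-fill masking, the `mask_ratio` value, the linear beta schedule, and the function names `mask_frames`, `linear_alpha_bar`, and `forward_diffuse` are all hypothetical. The noising step uses the standard closed-form DDPM forward process.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def mask_frames(mel, mask_ratio, rng):
    """Zero out a random subset of time frames (columns).

    Frame-wise zero-fill is an assumed masking strategy, not taken from the paper.
    Returns a masked copy and the set of masked frame indices.
    """
    num_frames = len(mel[0])
    num_masked = int(round(mask_ratio * num_frames))
    masked = set(rng.sample(range(num_frames), num_masked))
    out = [[0.0 if j in masked else v for j, v in enumerate(row)] for row in mel]
    return out, masked


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, 1)."""
    a = math.sqrt(alpha_bar[t])
    s = math.sqrt(1.0 - alpha_bar[t])
    return [[a * v + s * rng.gauss(0.0, 1.0) for v in row] for row in x0]


rng = random.Random(0)
# Toy stand-in for a Mel-spectrogram: 8 frequency bins x 20 time frames.
mel = [[rng.random() for _ in range(20)] for _ in range(8)]
alpha_bar = linear_alpha_bar(num_steps=100)
masked_mel, masked_idx = mask_frames(mel, mask_ratio=0.3, rng=rng)
noisy = forward_diffuse(masked_mel, t=50, alpha_bar=alpha_bar, rng=rng)
```

In a full system, a text-conditioned network would be trained to reconstruct the clean features from inputs like `noisy`; that model is outside the scope of this sketch.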

Diffusion-Based Adversarial Purification for Speaker Verification

Oct 24, 2023

Yibo Bai, Xiao-Lei Zhang

Abstract: Automatic speaker verification (ASV) based on deep learning is highly vulnerable to adversarial attacks, which inject imperceptible perturbations into audio signals to make the ASV system produce wrong decisions. This poses a significant threat to the security and reliability of ASV systems. To address this issue, we propose a Diffusion-Based Adversarial Purification (DAP) method that enhances the robustness of ASV systems against such adversarial attacks. Our method leverages a conditional denoising diffusion probabilistic model to purify the adversarial examples and mitigate the impact of perturbations. DAP first introduces controlled noise into adversarial examples, and then performs a reverse denoising process to reconstruct clean audio. Experimental results demonstrate the efficacy of the proposed DAP in enhancing the security of ASV while minimizing the distortion of the purified audio signals.
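
The two-stage procedure in the DAP abstract (inject controlled noise, then denoise in reverse) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: `predict_noise` is a hypothetical placeholder that stands in for a trained conditional noise predictor and simply returns zeros, and the linear beta schedule, the starting step `t_star`, and the choice to condition on the adversarial input are all assumed. The reverse update is the standard DDPM ancestral sampling step.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (assumed) and its cumulative alpha products."""
    betas = [beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        alpha_bars.append(running)
    return betas, alpha_bars


def predict_noise(x_t, t, condition):
    """Hypothetical placeholder for a trained conditional noise predictor."""
    return [0.0] * len(x_t)


def purify(adversarial, t_star, betas, alpha_bars, rng, noise_model=predict_noise):
    # Stage 1: diffuse the adversarial signal forward to step t_star in closed form.
    a = math.sqrt(alpha_bars[t_star])
    s = math.sqrt(1.0 - alpha_bars[t_star])
    x = [a * v + s * rng.gauss(0.0, 1.0) for v in adversarial]
    # Stage 2: DDPM reverse steps from t_star down to 0.
    for t in range(t_star, -1, -1):
        eps = noise_model(x, t, adversarial)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv = 1.0 / math.sqrt(1.0 - betas[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the final step
        x = [inv * (xi - coef * ei) + sigma * rng.gauss(0.0, 1.0)
             for xi, ei in zip(x, eps)]
    return x


rng = random.Random(0)
betas, alpha_bars = make_schedule(50)
# Toy "adversarial" waveform: a sine wave plus a small perturbation.
adversarial = [math.sin(0.1 * n) + 0.01 * rng.gauss(0.0, 1.0) for n in range(160)]
purified = purify(adversarial, t_star=5, betas=betas, alpha_bars=alpha_bars, rng=rng)
```

With the zero-valued placeholder, the output is not actually purified; the sketch only shows the control flow. A small `t_star` reflects the trade-off the abstract mentions: enough noise to drown out the perturbation, but not so much that the reconstructed audio is distorted.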
