
Zhihua Fang

Hyperbolic Additive Margin Softmax with Hierarchical Information for Speaker Verification

Jan 27, 2026

Zhihua Fang, Liang He

Abstract: Speaker embedding learning based on Euclidean space has achieved significant progress, but it is still insufficient in modeling hierarchical information within speaker features. Hyperbolic space, with its negative curvature, can efficiently represent hierarchical information within a finite volume, making it more suitable for the feature distribution of speaker embeddings. In this paper, we propose Hyperbolic Softmax (H-Softmax) and Hyperbolic Additive Margin Softmax (HAM-Softmax) based on hyperbolic space. H-Softmax incorporates hierarchical information into speaker embeddings by projecting embeddings and speaker centers into hyperbolic space and computing hyperbolic distances. HAM-Softmax further enhances inter-class separability by introducing a margin constraint on this basis. Experimental results show that H-Softmax and HAM-Softmax achieve average relative EER reductions of 27.84% and 14.23% compared with standard Softmax and AM-Softmax, respectively, demonstrating that the proposed methods improve speaker verification performance while preserving the capability of hierarchical structure modeling. The code will be released at https://github.com/PunkMale/HAM-Softmax.

* 5 pages, 3 figures, Accepted at ICASSP 2026
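The abstract does not give the loss in closed form, so the following is only a minimal sketch of the general idea it describes: project an embedding and the speaker centers into hyperbolic space, score each class by hyperbolic distance, and optionally penalize the target class with an additive margin. The choice of the Poincaré ball with curvature -1, the exponential map at the origin as the projection, the `scale` and `margin` parameters, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math

def exp_map_origin(v, eps=1e-9):
    """Assumed projection: exponential map at the origin of the unit Poincare ball."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < eps:
        return [0.0 for _ in v]
    factor = math.tanh(norm) / norm  # resulting norm is tanh(norm) < 1
    return [factor * x for x in v]

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)  # guard against the ball boundary
    return math.acosh(1.0 + 2.0 * sq_diff / denom)

def hyperbolic_margin_loss(embedding, centers, target, scale=10.0, margin=0.0):
    """Cross-entropy over logits = -scale * distance (hypothetical formulation).

    With margin == 0 this plays the role of a distance-based softmax; a positive
    margin enlarges the target-class distance before the softmax is applied.
    """
    z = exp_map_origin(embedding)
    dists = [poincare_distance(z, exp_map_origin(c)) for c in centers]
    logits = [-scale * (d + margin if k == target else d)
              for k, d in enumerate(dists)]
    top = max(logits)  # log-sum-exp with the max subtracted for stability
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[target]

# Toy usage: two speaker centers in 2-D, one embedding close to the first center.
centers = [[1.0, 0.0], [0.0, 1.0]]
embedding = [0.8, 0.1]
loss_plain = hyperbolic_margin_loss(embedding, centers, target=0)
loss_margin = hyperbolic_margin_loss(embedding, centers, target=0, margin=0.5)
```

In this sketch the margin makes the same embedding incur a larger loss, so training would have to pull it closer to its own center than a margin-free objective requires.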


Noise Supervised Contrastive Learning and Feature-Perturbed for Anomalous Sound Detection

Sep 18, 2025

Shun Huang, Zhihua Fang, Liang He

Abstract: Unsupervised anomalous sound detection aims to detect unknown anomalous sounds by training a model using only normal audio data. Despite advancements in self-supervised methods, the issue of frequent false alarms when handling samples of the same type from different machines remains unresolved. This paper introduces a novel training technique called one-stage supervised contrastive learning (OS-SCL), which significantly addresses this problem by perturbing features in the embedding space and employing a one-stage noisy supervised contrastive learning approach. On the DCASE 2020 Challenge Task 2, it achieved 94.64% AUC, 88.42% pAUC, and 89.24% mAUC using only Log-Mel features. Additionally, a time-frequency feature named TFgram is proposed, which is extracted from raw audio. This feature effectively captures critical information for anomalous sound detection, ultimately achieving 95.71% AUC, 90.23% pAUC, and 91.23% mAUC. The source code is available at: www.github.com/huangswt/OS-SCL.

* Accepted at ICASSP 2025
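The abstract names two ingredients, feature perturbation in the embedding space and a supervised contrastive objective, without specifying either. As a rough illustration only, the sketch below pairs additive Gaussian noise (an assumed perturbation) with the widely used supervised contrastive loss formulation; `perturb`, `supcon_loss`, `sigma`, and `temperature` are hypothetical names and values, and the actual OS-SCL procedure may differ.

```python
import math
import random

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def perturb(v, sigma, rng):
    """Assumed perturbation: add zero-mean Gaussian noise to an embedding."""
    return [x + rng.gauss(0.0, sigma) for x in v]

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Similarity of anchor i to every other sample, scaled by temperature.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy usage: four embeddings from two machine IDs, plus one noisy copy of each.
rng = random.Random(0)
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
augmented = features + [perturb(f, sigma=0.05, rng=rng) for f in features]
loss = supcon_loss(augmented, labels + labels)
```

Each perturbed copy keeps the label of its source embedding, so it acts as an extra positive for that class in the contrastive objective.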


OR-Gate: A Noisy Label Filtering Method for Speaker Verification

Nov 22, 2022

Zhihua Fang, Hanhan Ma, Lin Li, Liang He


Abstract: The deep learning models used for speaker verification depend heavily on large-scale data and correct labels. However, noisy (wrong) labels often occur and degrade the system's performance. Unfortunately, there are relatively few studies in this area. In this paper, we propose a method to gradually filter out noisy labels at the training stage. We compare the network predictions at different training epochs with the ground-truth labels, and select reliable (considered correct) labels using an OR gate mechanism like that in logic circuits. We therefore name the proposed method OR-Gate. We experimentally demonstrate that OR-Gate can effectively filter out noisy labels and achieves excellent performance.

* Submitted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
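One plausible reading of the selection rule in the abstract is that a label counts as reliable if the network's prediction agreed with it in at least one of the epochs being compared, which is a logical OR over per-epoch agreement. The sketch below illustrates only that reading; the function name `or_gate_filter` and the choice of which epochs to compare are assumptions, and the paper's actual schedule is not stated here.

```python
def or_gate_filter(labels, predictions_per_epoch):
    """Return one boolean per sample: True if any epoch's prediction matched the label."""
    reliable = [False] * len(labels)
    for epoch_predictions in predictions_per_epoch:
        for i, (pred, label) in enumerate(zip(epoch_predictions, labels)):
            # OR gate: once a sample has matched in any epoch, it stays reliable.
            reliable[i] = reliable[i] or (pred == label)
    return reliable

# Toy usage: four training samples, predictions recorded at two epochs.
labels = [0, 1, 2, 1]
predictions_per_epoch = [
    [0, 2, 2, 0],  # earlier epoch
    [0, 1, 0, 0],  # later epoch
]
mask = or_gate_filter(labels, predictions_per_epoch)
# mask == [True, True, True, False]; only the last sample is treated as noisy.
kept_indices = [i for i, keep in enumerate(mask) if keep]
```

Under this reading, a sample is dropped only when the network disagrees with its label in every compared epoch, which makes the filter conservative about discarding data.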
