Gangin Park

SEED: Speaker Embedding Enhancement Diffusion Model

May 22, 2025

KiHyun Nam, Jungwoo Heo, Jee-weon Jung, Gangin Park, Chaeyoung Jung, Ha-Jin Yu, Joon Son Chung

Abstract: A primary challenge when deploying speaker recognition systems in real-world applications is performance degradation caused by environmental mismatch. We propose a diffusion-based method that takes speaker embeddings extracted from a pre-trained speaker recognition model and generates refined embeddings. For training, our approach progressively adds Gaussian noise to both clean and noisy speaker embeddings, extracted from clean and noisy speech respectively, via the forward process of a diffusion model, and then reconstructs them into clean embeddings in the reverse process. At inference time, all embeddings are regenerated via the diffusion process. Our method requires neither speaker labels nor any modification to the existing speaker recognition pipeline. Experiments on evaluation sets simulating environmental mismatch scenarios show that our method can improve recognition accuracy by up to 19.6% over baseline models while retaining performance on conventional scenarios. Our code is available at https://github.com/kaistmm/seed-pytorch

* Accepted to Interspeech 2025. The official code can be found at https://github.com/kaistmm/seed-pytorch
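
The training setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the official repository for that). It assumes a standard DDPM-style forward process with a linear beta schedule, a 192-dimensional embedding, and an equal chance of picking the clean or noisy embedding as input, none of which are stated in the abstract. The function names `make_alpha_bars`, `forward_diffuse`, and `make_training_example` are hypothetical.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps, eps ~ N(0, I)."""
    a_bar = alpha_bars[t]
    scale, sigma = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def make_training_example(clean_emb, noisy_emb, alpha_bars, rng):
    """Diffuse either the clean or the noisy embedding; the target is always clean."""
    source = clean_emb if rng.random() < 0.5 else noisy_emb  # 50/50 split is assumed
    t = rng.randrange(len(alpha_bars))
    return forward_diffuse(source, t, alpha_bars, rng), t, clean_emb


rng = random.Random(0)
alpha_bars = make_alpha_bars()
# Stand-ins for embeddings from a pre-trained speaker model (dimension assumed).
clean = [rng.gauss(0.0, 1.0) for _ in range(192)]
noisy = [v + 0.3 * rng.gauss(0.0, 1.0) for v in clean]
x_t, t, target = make_training_example(clean, noisy, alpha_bars, rng)
```

A denoising network would then be trained to recover `target` from `x_t` and `t`. Because the target is the clean embedding in both cases, the reverse process learns to map noisy-condition embeddings toward their clean counterparts without using speaker labels.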

N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras

Dec 02, 2021

Junho Kim, Jaehyeok Bae, Gangin Park, Young Min Kim

Abstract: We introduce N-ImageNet, a large-scale dataset for robust, fine-grained object recognition with event cameras. The dataset is collected using programmable hardware in which an event camera consistently moves around a monitor displaying images from ImageNet. N-ImageNet serves as a challenging benchmark for event-based object recognition due to its large number of classes and samples. We empirically show that pretraining on N-ImageNet improves the performance of event-based classifiers and helps them learn from few labeled samples. In addition, we present several variants of N-ImageNet to test the robustness of event-based classifiers under diverse camera trajectories and severe lighting conditions, and propose a novel event representation to alleviate the resulting performance degradation. To the best of our knowledge, we are the first to quantitatively investigate the effects of various environmental conditions on event-based object recognition algorithms. N-ImageNet and its variants are expected to guide practical implementations for deploying event-based object recognition algorithms in the real world.

* Accepted to ICCV 2021
