Ha-Jin Yu

SV-Mixer: Replacing the Transformer Encoder with Lightweight MLPs for Self-Supervised Model Compression in Speaker Verification

Sep 17, 2025
Via arXiv

Token-based Attractors and Cross-attention in Spoof Diarization

Sep 16, 2025
Via arXiv

SEED: Speaker Embedding Enhancement Diffusion Model

May 22, 2025
Via arXiv

MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms

Jun 11, 2024
Via arXiv

NM-FlowGAN: Modeling sRGB Noise with a Hybrid Approach based on Normalizing Flows and Generative Adversarial Networks

Dec 15, 2023
Via arXiv

HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods

Sep 15, 2023
Via arXiv

Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

Sep 14, 2023
Via arXiv

PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

Jul 20, 2023
Via arXiv

One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

Jun 08, 2023
Via arXiv

SS-BSN: Attentive Blind-Spot Network for Self-Supervised Denoising with Nonlocal Self-Similarity

May 17, 2023
Via arXiv