Hoirin Kim

HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization

Aug 17, 2025
Via arXiv

ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction

Aug 10, 2025
Via arXiv

Improving Cross-Lingual Phonetic Representation of Low-Resource Languages Through Language Similarity Analysis

Jan 12, 2025
Via arXiv

Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

Jul 04, 2024
Via arXiv

One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection

Jun 24, 2024
Via arXiv

STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models

Dec 14, 2023
Via arXiv

Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation

May 19, 2023
Via arXiv

Deep Metric Learning with Adaptive Margin and Adaptive Scale for Acoustic Word Discrimination

Oct 26, 2022
Via arXiv

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Jul 01, 2022
Via arXiv

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Apr 04, 2022
Via arXiv