Myunghun Jung

Adversarial Deep Metric Learning for Cross-Modal Audio-Text Alignment in Open-Vocabulary Keyword Spotting

May 22, 2025
Via arXiv

Text-Aware Adapter for Few-Shot Keyword Spotting

Dec 24, 2024
Via arXiv

Deep Metric Learning with Adaptive Margin and Adaptive Scale for Acoustic Word Discrimination

Oct 26, 2022
Via arXiv

Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings

Mar 30, 2022
Via arXiv

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

May 16, 2020
Via arXiv

Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification

Apr 14, 2020
Via arXiv

Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings

Oct 01, 2019
Via arXiv

Learning acoustic word embeddings with phonetically associated triplet network

Nov 28, 2018
Via arXiv