Youngmoon Jung

CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting

Jun 12, 2024
Via arXiv

Relational Proxy Loss for Audio-Text based Keyword Spotting

Jun 08, 2024
Via arXiv

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Jul 01, 2022
Via arXiv

Perceptually Guided End-to-End Text-to-Speech

Nov 02, 2020
Via arXiv

A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments

Oct 06, 2020
Via arXiv

Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling

Aug 09, 2020
Via arXiv

Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning With Spoofing Detection and Spoofing Type Classification

Jul 16, 2020
Via arXiv

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

May 16, 2020
Via arXiv

Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification

Apr 14, 2020
Via arXiv

Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs

Apr 06, 2020
Via arXiv