Nam Soo Kim

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Jan 03, 2024
Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

Via arXiv

Efficient Parallel Audio Generation using Group Masked Language Modeling

Jan 02, 2024
Myeonghun Jeong, Minchan Kim, Joun Yeop Lee, Nam Soo Kim

Via arXiv

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings

Dec 11, 2023
Sung Hwan Mun, Min Hyun Han, Canyeong Moon, Nam Soo Kim

Via arXiv

Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction

Nov 08, 2023
Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Dongjune Lee, Nam Soo Kim

Via arXiv

EM-Network: Oracle Guided Self-distillation for Sequence Learning

Jun 14, 2023
Ji Won Yoon, Sunghwan Ahn, Hyeonseung Lee, Minchan Kim, Seok Min Kim, Nam Soo Kim

Via arXiv

MCR-Data2vec 2.0: Improving Self-supervised Speech Pre-training via Model-level Consistency Regularization

Jun 14, 2023
Ji Won Yoon, Seok Min Kim, Nam Soo Kim

Via arXiv

Towards single integrated spoofing-aware speaker verification embeddings

Jun 01, 2023
Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung

Via arXiv

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus

Apr 01, 2023
Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, Jihwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim

Via arXiv

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

Nov 30, 2022
Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim

Via arXiv

Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition

Nov 28, 2022
Ji Won Yoon, Beom Jun Woo, Sunghwan Ahn, Hyeonseung Lee, Nam Soo Kim

Via arXiv