Myeonghun Jeong

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech

Feb 13, 2025

SegINR: Segment-wise Implicit Neural Representation for Sequence Alignment in Neural Text-to-Speech

Oct 07, 2024

High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model

Jun 25, 2024

MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance

Jun 10, 2024

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Jan 03, 2024

Efficient Parallel Audio Generation using Group Masked Language Modeling

Jan 02, 2024

Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction

Nov 08, 2023

Towards single integrated spoofing-aware speaker verification embeddings

Jun 01, 2023

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

Nov 30, 2022

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

Oct 12, 2022