
Dan Su


End-to-End Voice Conversion with Information Perturbation

Jun 15, 2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Jun 01, 2022

AiSocrates: Towards Answering Ethical Quandary Questions

May 24, 2022

FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis

Apr 21, 2022

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Apr 14, 2022

Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis

Apr 03, 2022

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Mar 25, 2022

Read before Generate! Faithful Long Form Question Answering with Machine Reading

Mar 01, 2022

VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion

Feb 18, 2022

QA4QG: Using Question Answering to Constrain Multi-Hop Question Generation

Feb 14, 2022