Dan Su

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Jun 01, 2022
Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang, Dan Su

AiSocrates: Towards Answering Ethical Quandary Questions

May 24, 2022
Yejin Bang, Nayeon Lee, Tiezheng Yu, Leila Khalatbari, Yan Xu, Dan Su, Elham J. Barezi, Andrea Madotto, Hayden Kee, Pascale Fung

FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis

Apr 21, 2022
Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Apr 14, 2022
Zhao You, Shulin Feng, Dan Su, Dong Yu

Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis

Apr 03, 2022
Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Mar 25, 2022
Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu

Read before Generate! Faithful Long Form Question Answering with Machine Reading

Mar 01, 2022
Dan Su, Xiaoguang Li, Jindi Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung

VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion

Feb 18, 2022
Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng
