Dongchao Yang

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Apr 06, 2024
Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie Liu, Jinyu Li, Sheng Zhao

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation

Dec 24, 2023
Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng

UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Oct 11, 2023
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Oct 10, 2023
Jiarui Hai, Helin Wang, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

Sep 03, 2023
Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou

Make-A-Voice: Unified Voice Synthesis With Discrete Representation

May 30, 2023
Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

May 29, 2023
Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec

May 07, 2023
Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou
