Dongchao Yang

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation

Dec 24, 2023
Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng

UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Oct 11, 2023
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Oct 10, 2023
Jiarui Hai, Helin Wang, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

Sep 03, 2023
Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou

Make-A-Voice: Unified Voice Synthesis With Discrete Representation

May 30, 2023
Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

May 29, 2023
Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec

May 07, 2023
Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Apr 25, 2023
Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe
