Kaitao Song

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

Via arXiv

EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model

Jan 11, 2024
Yuqi Chen, Kan Ren, Kaitao Song, Yansen Wang, Yifan Wang, Dongsheng Li, Lili Qiu

Via arXiv

EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction

Jan 11, 2024
Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Yongliang Shen, Ren Kan, Dongsheng Li, Deqing Yang

Via arXiv

TaskBench: Benchmarking Large Language Models for Task Automation

Nov 30, 2023
Yongliang Shen, Kaitao Song, Xu Tan, Wenqi Zhang, Kan Ren, Siyu Yuan, Weiming Lu, Dongsheng Li, Yueting Zhuang

Via arXiv

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Oct 25, 2023
Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

Via arXiv

Learning To Teach Large Language Models Logical Reasoning

Oct 13, 2023
Meiqi Chen, Yubo Ma, Kaitao Song, Yixin Cao, Yan Zhang, Dongsheng Li

Via arXiv

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Sep 15, 2023
Qingyan Guo, Rui Wang, Junliang Guo, Bei Li, Kaitao Song, Xu Tan, Guoqing Liu, Jiang Bian, Yujiu Yang

Via arXiv

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

Via arXiv

End-to-End Word-Level Pronunciation Assessment with MASK Pre-training

Jun 05, 2023
Yukang Liang, Kaitao Song, Shaoguang Mao, Huiqiang Jiang, Luna Qiu, Yuqing Yang, Dongsheng Li, Linli Xu, Lili Qiu

Via arXiv