Xu Tan

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Apr 06, 2024
Via arXiv

Mitigating Reversal Curse via Semantic-aware Permutation Training

Mar 07, 2024
Via arXiv

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Via arXiv

Beyond Language Models: Byte Models are Digital World Simulators

Feb 29, 2024
Via arXiv

EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction

Jan 11, 2024
Via arXiv

xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein

Jan 11, 2024
Via arXiv

CoMoSVC: Consistency Model-based Singing Voice Conversion

Jan 03, 2024
Via arXiv

Unraveling Key Factors of Knowledge Distillation

Dec 24, 2023
Via arXiv

Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Dec 06, 2023
Via arXiv

TaskBench: Benchmarking Large Language Models for Task Automation

Nov 30, 2023
Via arXiv