Shiyin Kang

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Feb 25, 2024
Via arXiv

SCNet: Sparse Compression Network for Music Source Separation

Jan 24, 2024
Via arXiv

Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

Jan 15, 2024
Via arXiv

AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation

Oct 11, 2023
Via arXiv

Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts

Sep 22, 2023
Via arXiv

Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

Aug 31, 2023
Via arXiv

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

Aug 31, 2023
Via arXiv

Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis

Aug 31, 2023
Via arXiv

MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis

Jul 29, 2023
Via arXiv

GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network

Apr 25, 2023
Via arXiv