Lichao Zhang

Accompanied Singing Voice Synthesis with Fully Text-controlled Melody

Jul 02, 2024

Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations

Jun 21, 2024

Quality and Quantity: Unveiling a Million High-Quality Images for Text-to-Image Synthesis in Fashion Design

Nov 29, 2023

Efficient Human-AI Coordination via Preparatory Language-based Convention

Nov 01, 2023

Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting

Oct 12, 2023

DisCover: Disentangled Music Representation Learning for Cover Song Identification

Jul 19, 2023

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

May 24, 2023

AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment

May 24, 2023

Learning Robust Self-attention Features for Speech Emotion Recognition with Label-adaptive Mixup

May 07, 2023

TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation

May 25, 2022