Max W. Y. Lam

Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation

Mar 25, 2025
Via arXiv

SongCreator: Lyrics-based Universal Song Generation

Sep 09, 2024
Via arXiv

Foundation Models for Music: A Survey

Aug 27, 2024
Via arXiv

Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model

May 26, 2023
Via arXiv

Efficient Neural Music Generation

May 25, 2023
Via arXiv

FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis

Apr 21, 2022
Via arXiv

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Mar 25, 2022
Via arXiv

Bilateral Denoising Diffusion Models

Aug 31, 2021
Via arXiv

Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition

Jun 08, 2021
Via arXiv

Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation

Mar 08, 2021
Via arXiv