Music Generation


Music generation is the task of producing music or music-like sounds with a model or algorithm.

SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning

Jun 18, 2025
Via arXiv

SonicMotion: Dynamic Spatial Audio Soundscapes with Latent Diffusion Models

Jul 09, 2025
Via arXiv

Quantize More, Lose Less: Autoregressive Generation from Residually Quantized Speech Representations

Jul 16, 2025
Via arXiv

MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

May 29, 2025
Via arXiv

Adaptive Accompaniment with ReaLchords

Jun 17, 2025
Via arXiv

LEGATO: Large-scale End-to-end Generalizable Approach to Typeset OMR

Jun 23, 2025
Via arXiv

Towards Video to Piano Music Generation with Chain-of-Perform Support Benchmarks

May 26, 2025
Via arXiv

Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment

May 19, 2025
Via arXiv

SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture

Jun 26, 2025
Via arXiv

OpenDance: Multimodal Controllable 3D Dance Generation Using Large-scale Internet Data

Jun 09, 2025
Via arXiv