Music Generation


Music generation is the task of producing music or music-like sounds with a model or algorithm.

A Reproducible, Scalable Pipeline for Synthesizing Autoregressive Model Literature

Aug 06, 2025

How Animals Dance (When You're Not Looking)

May 29, 2025

SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture

Jun 26, 2025

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

Jun 14, 2025

SonicMotion: Dynamic Spatial Audio Soundscapes with Latent Diffusion Models

Jul 09, 2025

Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis

Jul 09, 2025

Quantize More, Lose Less: Autoregressive Generation from Residually Quantized Speech Representations

Jul 16, 2025

Audio-Sync Video Generation with Multi-Stream Temporal Control

Jun 09, 2025

Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey

Jun 16, 2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025