Music sentiment transfer is a completely novel task. Sentiment transfer is a natural evolution of the heavily-studied style transfer task, as sentiment transfer is rooted in applying the sentiment of a source to be the new sentiment for a target piece of media; yet compared to style transfer, sentiment transfer has been only scantily studied on images. Music sentiment transfer attempts to apply the high level objective of sentiment transfer to the domain of music. We propose CycleGAN to bridge disparate domains. In order to use the network, we choose to use symbolic, MIDI, data as the music format. Through the use of a cycle consistency loss, we are able to create one-to-one mappings that preserve the content and realism of the source data. Results and literature suggest that the task of music sentiment transfer is more difficult than image sentiment transfer because of the temporal characteristics of music and lack of existing datasets.
We explore the feasibility of using triplet neural networks to embed songs based on content-based music similarity. Our network is trained using triplets of songs such that two songs by the same artist are embedded closer to one another than to a third song by a different artist. We compare two models that are trained using different ways of picking this third song: at random vs. based on shared genre labels. Our experiments are conducted using songs from the Free Music Archive and use standard audio features. The initial results show that shallow Siamese networks can be used to embed music for a simple artist retrieval task.