Jordi Janer

ChoralSynth: Synthetic Dataset of Choral Singing

Nov 21, 2023

Jyoti Narang, Viviana De La Vega, Xavier Lizarraga, Oscar Mayor, Hector Parra, Jordi Janer, Xavier Serra

Abstract: Choral singing, a widely practiced form of ensemble singing, lacks comprehensive datasets in the realm of Music Information Retrieval (MIR) research, due to challenges arising from the requirement to curate multitrack recordings. To address this, we devised a novel methodology, leveraging state-of-the-art synthesizers to create and curate quality renditions. The scores were sourced from the Choral Public Domain Library (CPDL). This work was done in collaboration with a diverse team of musicians, software engineers and researchers. The resulting dataset, complete with its associated metadata, and the methodology are released as part of this work, opening up new avenues for exploration and advancement in the field of singing voice research.

* Dataset Link: https://doi.org/10.5281/zenodo.10137883

Voice conversion with limited data and limitless data augmentations

Dec 27, 2022

Olga Slizovskaia, Jordi Janer, Pritish Chandna, Oscar Mayor

Abstract: Voice conversion (VC) is the challenging but interesting task of modifying an input speech signal so that the perceived speaker changes to a target speaker while the content of the input is maintained. Over the last few years, this task has gained significant interest, with most systems using data-driven machine learning models. Performing the conversion in a low-latency, real-world scenario is even more challenging, as it is constrained by the availability of high-quality data. Data augmentations such as pitch shifting and noise addition are often used to increase the amount of data available for training machine-learning-based models for this task. In this paper we explore the efficacy of common data augmentation techniques for real-time voice conversion and introduce novel data augmentation techniques based on audio and voice transformation effects. We evaluate the conversions for both male and female target speakers using objective and subjective evaluation methodologies.
