Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nao Tokui

Neutone SDK: An Open Source Framework for Neural Audio Processing

Aug 12, 2025

Christopher Mitcheltree, Bogdan Teleaga, Andrew Fyfe, Naotake Masuda, Matthias Schäfer, Alfie Bradic, Nao Tokui

Abstract:Neural audio processing has unlocked novel methods of sound transformation and synthesis, yet integrating deep learning models into digital audio workstations (DAWs) remains challenging due to real-time / neural network inference constraints and the complexities of plugin development. In this paper, we introduce the Neutone SDK: an open source framework that streamlines the deployment of PyTorch-based neural audio models for both real-time and offline applications. By encapsulating common challenges such as variable buffer sizes, sample rate conversion, delay compensation, and control parameter handling within a unified, model-agnostic interface, our framework enables seamless interoperability between neural models and host plugins while allowing users to work entirely in Python. We provide a technical overview of the interfaces needed to accomplish this, as well as the corresponding SDK implementations. We also demonstrate the SDK's versatility across applications such as audio effect emulation, timbre transfer, and sample generation, as well as its adoption by researchers, educators, companies, and artists alike. The Neutone SDK is available at https://github.com/Neutone/neutone_sdk

* Accepted to AES International Conference on Artificial Intelligence and Machine Learning for Audio 2025

Via

Access Paper or Ask Questions

MR4MR: Mixed Reality for Melody Reincarnation

Sep 15, 2022

Atsuya Kobayashi, Ryogo Ishino, Ryuku Nobusue, Takumi Inoue, Keisuke Okazaki, Shoma Sawa, Nao Tokui

Figure 1 for MR4MR: Mixed Reality for Melody Reincarnation

Figure 2 for MR4MR: Mixed Reality for Melody Reincarnation

Figure 3 for MR4MR: Mixed Reality for Melody Reincarnation

Figure 4 for MR4MR: Mixed Reality for Melody Reincarnation

Abstract:There is a long history of an effort made to explore musical elements with the entities and spaces around us, such as musique concr\`ete and ambient music. In the context of computer music and digital art, interactive experiences that concentrate on the surrounding objects and physical spaces have also been designed. In recent years, with the development and popularization of devices, an increasing number of works have been designed in Extended Reality to create such musical experiences. In this paper, we describe MR4MR, a sound installation work that allows users to experience melodies produced from interactions with their surrounding space in the context of Mixed Reality (MR). Using HoloLens, an MR head-mounted display, users can bump virtual objects that emit sound against real objects in their surroundings. Then, by continuously creating a melody following the sound made by the object and re-generating randomly and gradually changing melody using music generation machine learning models, users can feel their ambient melody "reincarnating".

* Accepted paper at the 3rd Conference on AI Music Creativity (September 2022)

Via

Access Paper or Ask Questions

Can GAN originate new electronic dance music genres? -- Generating novel rhythm patterns using GAN with Genre Ambiguity Loss

Nov 25, 2020

Nao Tokui

Figure 1 for Can GAN originate new electronic dance music genres? -- Generating novel rhythm patterns using GAN with Genre Ambiguity Loss

Figure 2 for Can GAN originate new electronic dance music genres? -- Generating novel rhythm patterns using GAN with Genre Ambiguity Loss

Figure 3 for Can GAN originate new electronic dance music genres? -- Generating novel rhythm patterns using GAN with Genre Ambiguity Loss

Figure 4 for Can GAN originate new electronic dance music genres? -- Generating novel rhythm patterns using GAN with Genre Ambiguity Loss

Abstract:Since the introduction of deep learning, researchers have proposed content generation systems using deep learning and proved that they are competent to generate convincing content and artistic output, including music. However, one can argue that these deep learning-based systems imitate and reproduce the patterns inherent within what humans have created, instead of generating something new and creative. This paper focuses on music generation, especially rhythm patterns of electronic dance music, and discusses if we can use deep learning to generate novel rhythms, interesting patterns not found in the training dataset. We extend the framework of Generative Adversarial Networks(GAN) and encourage it to diverge from the dataset's inherent distributions by adding additional classifiers to the framework. The paper shows that our proposed GAN can generate rhythm patterns that sound like music rhythms but do not belong to any genres in the training dataset. The source code, generated rhythm patterns, and a supplementary plugin software for a popular Digital Audio Workstation software are available on our website.

Via

Access Paper or Ask Questions

Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Apr 01, 2020

Nao Tokui

Figure 1 for Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Figure 2 for Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Figure 3 for Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Figure 4 for Towards democratizing music production with AI-Design of Variational Autoencoder-based Rhythm Generator as a DAW plugin

Abstract:There has been significant progress in the music generation technique utilizing deep learning. However, it is still hard for musicians and artists to use these techniques in their daily music-making practice. This paper proposes a Variational Autoencoder\cite{Kingma2014}(VAE)-based rhythm generation system, in which musicians can train a deep learning model only by selecting target MIDI files, then generate various rhythms with the model. The author has implemented the system as a plugin software for a DAW (Digital Audio Workstation), namely a Max for Live device for Ableton Live. Selected professional/semi-professional musicians and music producers have used the plugin, and they proved that the plugin is a useful tool for making music creatively. The plugin, source code, and demo videos are available online.

* 4 pages

Via

Access Paper or Ask Questions