Alberto Bernardini

A Zero-Shot Physics-Informed Dictionary Learning Approach for Sound Field Reconstruction

Dec 24, 2024

Stefano Damiano, Federico Miotello, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti, Toon van Waterschoot

Abstract: Sound field reconstruction aims to estimate pressure fields in areas lacking direct measurements. Existing techniques often rely on strong assumptions or face challenges related to data availability or the explicit modeling of physical properties. To bridge these gaps, this study introduces a zero-shot, physics-informed dictionary learning approach to perform sound field reconstruction. Our method relies only on a few sparse measurements to learn a dictionary, without the need for additional training data. Moreover, by enforcing the Helmholtz equation during the optimization process, the proposed approach ensures that the reconstructed sound field is represented as a linear combination of a few physically meaningful atoms. Evaluations on real-world data show that our approach achieves comparable performance to state-of-the-art dictionary learning techniques, with the advantage of requiring only a few observations of the sound field and no training on a dataset.

* Accepted for publication at ICASSP 2025
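
The physics constraint named in the abstract above is the homogeneous Helmholtz equation, ∇²p + k²p = 0, with wavenumber k = 2πf/c. As a minimal sketch, and not the authors' implementation, the Python below checks numerically that a plane wave with the correct wavenumber has a near-zero Helmholtz residual, while one with the wrong wavenumber does not. A residual of this kind is one plausible way to penalize non-physical atoms. The function names, the 500 Hz frequency, the evaluation point, and the finite-difference step are all illustrative assumptions.

```python
import cmath
import math

def plane_wave(x, y, k, theta):
    """Unit-amplitude 2-D plane wave travelling along direction theta (radians)."""
    return cmath.exp(-1j * k * (x * math.cos(theta) + y * math.sin(theta)))

def helmholtz_residual(field, x, y, k, h=1e-3):
    """Five-point finite-difference estimate of (laplacian(p) + k^2 p) at (x, y)."""
    laplacian = (field(x + h, y) + field(x - h, y)
                 + field(x, y + h) + field(x, y - h)
                 - 4 * field(x, y)) / h ** 2
    return laplacian + k ** 2 * field(x, y)

c = 343.0                      # assumed speed of sound in air, m/s
f = 500.0                      # assumed analysis frequency, Hz
k = 2 * math.pi * f / c        # wavenumber, rad/m
theta = math.radians(30)

# A candidate atom that obeys the Helmholtz equation at wavenumber k.
physical_atom = lambda x, y: plane_wave(x, y, k, theta)
# A candidate atom with the wrong spatial frequency (hypothetical counterexample).
non_physical_atom = lambda x, y: plane_wave(x, y, 2 * k, theta)

res_physical = abs(helmholtz_residual(physical_atom, 0.4, 0.7, k))
res_non_physical = abs(helmholtz_residual(non_physical_atom, 0.4, 0.7, k))

print(f"residual, physical atom:     {res_physical:.2e}")
print(f"residual, non-physical atom: {res_non_physical:.2e}")
```

The first residual is limited only by finite-difference error, whereas the second is on the order of 3k², which is what a Helmholtz-based penalty would push toward zero.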

The IEEE-IS2 2024 Music Packet Loss Concealment Challenge

Sep 27, 2024

Alessandro Ilic Mezza, Alberto Bernardini

Abstract: We present the IEEE-IS2 2024 Music Packet Loss Concealment Challenge. We begin by detailing the challenge rules, followed by an overview of the provided baseline system, the blind test set, and the evaluation methodology used to determine the final ranking. This inaugural edition aimed to foster collaboration between researchers and practitioners from the fields of signal processing, machine learning, and networked music performance, while also laying the groundwork for future advancements in packet loss concealment for music signals.

* 8 pages, 4 figures, 3 tables. Official report of the IEEE-IS2 2024 Music Packet Loss Concealment Challenge, part of the 2nd International Workshop on Networked Immersive Audio

A Physics-Informed Neural Network-Based Approach for the Spatial Upsampling of Spherical Microphone Arrays

Jul 26, 2024

Federico Miotello, Ferdinando Terminiello, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: Spherical microphone arrays are convenient tools for capturing the spatial characteristics of a sound field. However, achieving superior spatial resolution requires arrays with numerous capsules, consequently leading to expensive devices. To address this issue, we present a method for spatially upsampling spherical microphone arrays with a limited number of capsules. Our approach exploits a physics-informed neural network with Rowdy activation functions, leveraging physical constraints to provide high-order microphone array signals, starting from low-order devices. Results show that, within its domain of application, our approach outperforms a state-of-the-art signal-processing method for spherical microphone array upsampling.

* Accepted for publication at IWAENC 2024

Data-Driven Room Acoustic Modeling Via Differentiable Feedback Delay Networks With Learnable Delay Lines

Mar 29, 2024

Alessandro Ilic Mezza, Riccardo Giampiccolo, Enzo De Sena, Alberto Bernardini

Abstract: Over the past few decades, extensive research has been devoted to the design of artificial reverberation algorithms aimed at emulating the room acoustics of physical environments. Despite significant advancements, automatic parameter tuning of delay-network models remains an open challenge. We introduce a novel method for finding the parameters of a Feedback Delay Network (FDN) such that its output renders the perceptual qualities of a measured room impulse response. The proposed approach involves the implementation of a differentiable FDN with trainable delay lines, which, for the first time, allows us to simultaneously learn each and every delay-network parameter via backpropagation. The iterative optimization process seeks to minimize a time-domain loss function incorporating differentiable terms accounting for energy decay and echo density. Through experimental validation, we show that the proposed method yields time-invariant frequency-independent FDNs capable of closely matching the desired acoustical characteristics, and outperforms existing methods based on genetic algorithms and analytical filter design.

* The article has been submitted to EURASIP Journal on Audio, Speech, and Music Processing on Jan 02, 2024 and is currently under review
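
For readers unfamiliar with the structure being tuned in the abstract above, the following is a minimal sketch of the forward pass of a time-invariant, frequency-independent FDN: a set of delay lines whose outputs are mixed by a feedback matrix and fed back to their inputs. It is not the paper's differentiable model and has no learnable parameters. The two delay lengths, the scaled Hadamard feedback matrix, the 0.9 loop gain, and the function name are assumptions chosen only for illustration.

```python
import math
from collections import deque

def fdn_impulse_response(delays, feedback, b, c, d, n_samples):
    """Impulse response of a simple FDN.

    delays   : delay-line lengths in samples
    feedback : square feedback matrix (list of rows)
    b, c     : input and output gain vectors
    d        : direct-path gain
    """
    # Each deque holds exactly `m` samples, so index 0 is the sample written m steps ago.
    lines = [deque([0.0] * m, maxlen=m) for m in delays]
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0                 # unit impulse input
        s = [line[0] for line in lines]            # delay-line outputs
        out.append(sum(ci * si for ci, si in zip(c, s)) + d * x)
        for i, line in enumerate(lines):
            fb = sum(feedback[i][j] * s[j] for j in range(len(s)))
            line.append(b[i] * x + fb)             # appending drops the oldest sample
    return out

g = 0.9                                            # assumed loop gain (< 1 for a decaying tail)
a = g / math.sqrt(2.0)
feedback = [[a, a],
            [a, -a]]                               # orthogonal Hadamard matrix scaled by g

h = fdn_impulse_response(delays=[3, 5], feedback=feedback,
                         b=[1.0, 1.0], c=[1.0, 1.0], d=0.0, n_samples=2000)
print(h[:12])
```

With these settings the first echoes appear at the two delay lengths (samples 3 and 5), and the tail decays because the feedback matrix is contractive. In the approach described in the abstract, quantities such as the delay lengths and gains would instead be optimized by backpropagation.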

HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays

Feb 21, 2024

Federico Miotello, Paolo Ostan, Mirco Pezzoli, Luca Comanducci, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: In this paper, we present HOMULA-RIR, a dataset of room impulse responses (RIRs) acquired using both higher-order microphones (HOMs) and a uniform linear array (ULA), in order to model a remote attendance teleconferencing scenario. Specifically, measurements were performed in a seminar room, where a 64-microphone ULA was used as a multichannel audio acquisition system in the proximity of the speakers, while HOMs were used to model 25 attendees actually present in the seminar room. The HOMs cover a wide area of the room, making the dataset also suitable for virtual acoustics applications. Through the measurement of the reverberation time and clarity index, and sample applications such as source localization and separation, we demonstrate the effectiveness of the HOMULA-RIR dataset.

* Accepted for publication at ICASSP 2024 - HSCMA Workshop
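
The two room-acoustic metrics named in the abstract above, reverberation time and clarity index, can both be computed directly from an RIR. The sketch below is not the authors' measurement code and does not use the dataset: it builds a synthetic, exponentially decaying RIR with a known 0.5 s reverberation time, then estimates the reverberation time from the Schroeder backward-integrated energy decay curve and the clarity index C50 from the early-to-late energy ratio. The sampling rate, the -5 dB to -25 dB evaluation range, the two-point slope estimate, and all function names are assumptions.

```python
import math

def energy_decay_curve_db(rir):
    """Schroeder backward integration of the squared RIR, normalized to 0 dB at t = 0."""
    edc, total = [], 0.0
    for s in reversed(rir):
        total += s * s
        edc.append(total)
    edc.reverse()
    return [10.0 * math.log10(v / edc[0]) for v in edc]

def rt60_from_edc(edc_db, fs, hi=-5.0, lo=-25.0):
    """Extrapolate the 60 dB decay time from the hi..lo dB span of the decay curve."""
    t_hi = next(i for i, v in enumerate(edc_db) if v <= hi) / fs
    t_lo = next(i for i, v in enumerate(edc_db) if v <= lo) / fs
    return 60.0 / (hi - lo) * (t_lo - t_hi)

def clarity_db(rir, fs, split_ms=50.0):
    """Clarity index: early-to-late energy ratio in dB (C50 when split_ms = 50)."""
    k = int(round(fs * split_ms / 1000.0))
    early = sum(s * s for s in rir[:k])
    late = sum(s * s for s in rir[k:])
    return 10.0 * math.log10(early / late)

fs = 8000                      # assumed sampling rate, Hz
true_rt60 = 0.5                # decay time of the synthetic RIR, s
decay = math.log(1000.0)       # amplitude falls by 60 dB over true_rt60
rir = [(-1) ** n * math.exp(-decay * n / (fs * true_rt60)) for n in range(fs)]

rt60 = rt60_from_edc(energy_decay_curve_db(rir), fs)
c50 = clarity_db(rir, fs)
print(f"estimated RT60: {rt60:.3f} s, C50: {c50:.2f} dB")
```

On measured RIRs the decay curve is usually fitted by linear regression over the evaluation range rather than from two crossing points, and the noise floor has to be handled before integration.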

Toward Deep Drum Source Separation

Dec 15, 2023

Alessandro Ilic Mezza, Riccardo Giampiccolo, Alberto Bernardini, Augusto Sarti

Abstract: In the past, the field of drum source separation faced significant challenges due to limited data availability, hindering the adoption of cutting-edge deep learning methods that have found success in other related audio applications. In this manuscript, we introduce StemGMD, a large-scale audio dataset of isolated single-instrument drum stems. Each audio clip is synthesized from MIDI recordings of expressive drum performances using ten real-sounding acoustic drum kits. Totaling 1224 hours, StemGMD is the largest audio dataset of drums to date and the first to comprise isolated audio clips for every instrument in a canonical nine-piece drum kit. We leverage StemGMD to develop LarsNet, a novel deep drum source separation model. Through a bank of dedicated U-Nets, LarsNet can separate five stems from a stereo drum mixture faster than real-time and is shown to significantly outperform state-of-the-art nonnegative spectro-temporal factorization methods.

* 9 pages, 2 figures. Submitted to Pattern Recognition Letters

Reconstruction of Sound Field through Diffusion Models

Dec 14, 2023

Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti

Abstract: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented reality (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained to reconstruct the sound field (SF-Diff) over an extended domain. The architecture is designed to be conditioned on a limited set of available measurements at different frequencies and to generate the sound field at unknown target locations. The results show that SF-Diff is able to provide accurate reconstructions, outperforming a state-of-the-art baseline based on kernel interpolation.

* Accepted for publication at ICASSP 2024
