Pedro Vera-Candeas

Learning Input-Channel Permutation Equivariance for Multi-Channel Source Separation: Reducing Bleeding in Small Music Ensembles

Jun 15, 2026

Ruchi Pandey, Jaime Garcia-Martinez, Pablo Cabanas-Molero, David Diaz Guerra, Ricardo Falcon Perez, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

Abstract: Microphone bleed is a persistent challenge in small ensembles and orchestral recordings, where close microphones intended for individual instruments also capture leakage from nearby sources. This overlap degrades track isolation and complicates mixing. This paper addresses the bleeding problem by making channel-permutation equivariance a core learning principle. During training, we apply the same random permutation to the input microphone channels and their corresponding reference targets. This discourages reliance on fixed channel-instrument associations and improves robustness to changes in the recording setup and even in the recorded instruments. The proposed model is trained on synthetic ensembles with diverse simulated room acoustics and microphone placements, and evaluated on unseen simulated conditions and real URMP recordings. The results show that permutation-aware training consistently improves SDR and reduces bleeding under unseen conditions compared with non-permutation baselines. The findings highlight permutation equivariance as a simple, data-centric strategy for robust debleeding and practical multi-channel source separation in music production workflows.
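The training-time step described in the abstract (one random permutation shared by the input channels and their reference targets) can be illustrated with a minimal sketch. This is not the authors' code: the function name `permute_channels`, the array shapes, and the toy bleed model below are assumptions made only for illustration.

```python
import numpy as np


def permute_channels(mics, targets, rng):
    """Apply one shared random permutation to the channel axis of both arrays.

    mics:    (n_channels, n_samples) close-microphone signals, including bleed
    targets: (n_channels, n_samples) clean reference signal for each microphone
    Both shapes are assumptions of this sketch, not taken from the paper.
    """
    if mics.shape[0] != targets.shape[0]:
        raise ValueError("mics and targets must have the same number of channels")
    perm = rng.permutation(mics.shape[0])
    # Indexing both arrays with the same `perm` keeps every microphone
    # paired with its own reference target after shuffling.
    return mics[perm], targets[perm], perm


rng = np.random.default_rng(0)
n_channels, n_samples = 4, 8
targets = rng.standard_normal((n_channels, n_samples))

# Hypothetical bleed model: each microphone captures its own source
# at full level plus 10% of every other source.
mix = np.full((n_channels, n_channels), 0.1)
np.fill_diagonal(mix, 1.0)
mics = mix @ targets

p_mics, p_targets, perm = permute_channels(mics, targets, rng)
```

Drawing a fresh permutation for every training example means that no channel index is consistently tied to one instrument, which is the association the abstract says the training should discourage.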

SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation

Sep 17, 2024

Jaime Garcia-Martinez, David Diaz-Guerra, Archontis Politis, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

Abstract: Music source separation has progressed significantly in recent years, particularly in isolating vocals, drums, and bass elements from mixed tracks. These developments owe much to the creation and use of large-scale, multitrack datasets dedicated to these specific components. However, the challenge of extracting similarly sounding sources from orchestra recordings has not been extensively explored, largely due to a scarcity of comprehensive and clean (i.e., bleed-free) multitrack datasets. In this paper, we introduce a novel multitrack dataset called SynthSOD, developed using a set of simulation techniques to create a realistic (i.e., using high-quality soundfonts), musically motivated, and heterogeneous training set comprising different dynamics, natural tempo changes, styles, and conditions. Moreover, we demonstrate the application of a widely used baseline music separation model trained on our synthesized dataset with respect to the well-known EnsembleSet, and evaluate its performance under both synthetic and real-world conditions.

* Submitted to the OJSP - ICASSP 2025

Pre-trained Spatial Priors on Multichannel NMF for Music Source Separation

Oct 09, 2023

Pablo Cabanas-Molero, Antonio J. Munoz-Montoro, Julio Carabias-Orti, Pedro Vera-Candeas

Abstract: This paper presents a novel approach to sound source separation that leverages spatial information obtained during the recording setup. Our method trains a spatial mixing filter using solo passages to capture information about the room impulse response and transducer response at each sensor location. This pre-trained filter is then integrated into a multichannel non-negative matrix factorization (MNMF) scheme to better capture the variances of different sound sources. The recording setup used in our experiments is the typical setup for orchestra recordings, with a main microphone and a close "cardioid" or "supercardioid" microphone for each section of the orchestra. This makes the proposed method applicable to many existing recordings. Experiments on polyphonic ensembles demonstrate the effectiveness of the proposed framework in separating individual sound sources, improving performance compared to conventional MNMF methods.

* Accepted for publication at Forum Acusticum 2023
