Abstract: In a wireless acoustic sensor network (WASN), devices (i.e., nodes) can collaborate through distributed algorithms to collectively perform audio signal processing tasks. This paper focuses on the distributed estimation of node-specific desired speech signals using network-wide Wiener filtering. The objective is to match the performance of a centralized system that would have access to all microphone signals, while reducing the communication bandwidth used by the algorithm. Existing solutions, such as the distributed adaptive node-specific signal estimation (DANSE) algorithm, converge towards the multichannel Wiener filter (MWF), which solves a centralized linear minimum mean square error (LMMSE) signal estimation problem. However, they do so iteratively, which can be slow and impractical. Many solutions also assume that all nodes observe the same set of sources of interest, which is often not the case in practice. To overcome these limitations, we propose the distributed multichannel Wiener filter (dMWF) for fully connected WASNs. The dMWF is non-iterative and optimal even when nodes observe different sets of sources. In this algorithm, nodes exchange neighbor-pair-specific, low-dimensional (fused) signals that estimate the contribution of the sources observed by both nodes in the pair. We formally prove the optimality of the dMWF and demonstrate its performance in simulated speech enhancement experiments. The proposed algorithm is shown to outperform DANSE in terms of objective metrics after short operation times, highlighting the benefit of its non-iterative design.
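For orientation, the centralized MWF that the abstract uses as its performance target can be sketched in a few lines of NumPy. This is only an illustration of the LMMSE benchmark, not the dMWF or DANSE themselves. The real-valued mixing vector `a`, the white-noise model, the oracle cross-correlation `r_yd`, and the function name `mwf_filter` are all assumptions made for the example.

```python
import numpy as np

def mwf_filter(R_yy, r_yd):
    """LMMSE filter w that minimizes E[(d - w^T y)^2], i.e. w = R_yy^{-1} r_yd."""
    return np.linalg.solve(R_yy, r_yd)

rng = np.random.default_rng(0)
M, N = 4, 20000                      # microphones, samples (hypothetical sizes)

a = rng.standard_normal(M)           # assumed real-valued mixing (steering) vector
s = rng.standard_normal(N)           # stand-in for the desired source signal
x = np.outer(a, s)                   # desired-signal component at each microphone
noise = 0.5 * rng.standard_normal((M, N))
y = x + noise                        # all microphone signals, as seen centrally

d = x[0]                             # desired signal: source component at mic 0
R_yy = y @ y.T / N                   # sample covariance of the microphone signals
r_yd = y @ d / N                     # oracle cross-correlation (assumption: d is known)

w = mwf_filter(R_yy, r_yd)
d_hat = w @ y                        # centralized MWF estimate of d

mse_mwf = np.mean((d_hat - d) ** 2)
mse_raw = np.mean((y[0] - d) ** 2)   # error of the unprocessed reference microphone
print(f"MSE raw mic: {mse_raw:.4f}, MSE MWF: {mse_mwf:.4f}")
```

In a real system the desired signal is not observable, so the cross-correlation would have to be estimated, for instance from speech-plus-noise and noise-only covariance estimates; the oracle value is used here only to keep the sketch short.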
Abstract: Virtual and augmented reality are increasingly popular tools in many domains, such as architecture, production, training and education, (psycho)therapy, and gaming. For a convincing rendering of sound in virtual and augmented environments, audio signals must be convolved in real time with impulse responses that change from one moment to the next. Key requirements for the implementation of such time-variant real-time convolution algorithms are short latencies, moderate computational cost and memory footprint, and no perceptible switching artifacts. In this engineering report, we introduce a partitioned convolution algorithm that is able to quickly switch between impulse responses without introducing perceptible artifacts, while maintaining a constant computational load and low memory usage. Implementations in several popular programming languages are freely available via GitHub.
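To make the problem setting concrete, here is a minimal sketch of a textbook uniformly partitioned overlap-save convolver with a naive one-block output crossfade when the impulse response changes. It is not the algorithm of the report: this naive scheme runs two filters during the switching block, so its load is not constant, which is exactly the kind of cost the report sets out to avoid. The class name `PartitionedConvolver`, its methods, and the restriction that both impulse responses span the same number of partitions are assumptions made for the example.

```python
import numpy as np

class PartitionedConvolver:
    """Uniformly partitioned overlap-save convolution with block size B."""

    def __init__(self, ir, block):
        self.B = block
        self.H = self._partition(ir)          # spectra of the IR partitions
        self.H_next = None                    # pending IR, if a switch was requested
        # Frequency-domain delay line holding spectra of past input buffers.
        self.fdl = np.zeros((self.H.shape[0], block + 1), dtype=complex)
        self.prev = np.zeros(block)           # previous input block

    def _partition(self, ir):
        B = self.B
        P = -(-len(ir) // B)                  # number of partitions (ceiling)
        padded = np.zeros(P * B)
        padded[:len(ir)] = ir
        # Each length-B partition is zero-padded to 2B by the FFT.
        return np.fft.rfft(padded.reshape(P, B), n=2 * B, axis=1)

    def switch(self, new_ir):
        """Request a crossfade to new_ir during the next processed block."""
        H_next = self._partition(new_ir)
        if H_next.shape != self.H.shape:
            raise ValueError("sketch assumes equal partition counts")
        self.H_next = H_next

    def _filter(self, H):
        # Multiply-accumulate over partitions, keep the valid last B samples.
        Y = np.sum(self.fdl * H, axis=0)
        return np.fft.irfft(Y, n=2 * self.B)[self.B:]

    def process(self, x):
        """Process one block of exactly B input samples."""
        buf = np.concatenate([self.prev, x])  # overlap-save input buffer
        self.prev = x.copy()
        self.fdl = np.roll(self.fdl, 1, axis=0)
        self.fdl[0] = np.fft.rfft(buf)
        y = self._filter(self.H)
        if self.H_next is not None:
            # Naive switch: filter the same input history with both IRs
            # and crossfade linearly over this single block.
            y_new = self._filter(self.H_next)
            fade = np.linspace(0.0, 1.0, self.B, endpoint=False)
            y = (1.0 - fade) * y + fade * y_new
            self.H, self.H_next = self.H_next, None
        return y

# Usage: switch impulse responses before the sixth block.
rng = np.random.default_rng(1)
B = 64
ir_a, ir_b = rng.standard_normal(200), rng.standard_normal(200)
x = rng.standard_normal(10 * B)
conv = PartitionedConvolver(ir_a, B)
blocks = []
for k in range(10):
    if k == 5:
        conv.switch(ir_b)
    blocks.append(conv.process(x[k * B:(k + 1) * B]))
out = np.concatenate(blocks)
```

Because both filters read the same frequency-domain delay line, the output before the switch matches direct convolution with the first impulse response, and the output after the crossfade block matches direct convolution with the second one. The latency of this scheme equals the block size.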