Abstract: The robustness of deep neural networks (DNNs) can be certified through their Lipschitz continuity, which has made the construction of Lipschitz-continuous DNNs an active research field. However, DNNs for audio processing have not been a major focus because they are poorly compatible with existing results. In this paper, we consider the amplitude modifier (AM), a popular architecture for handling audio signals, and propose its Lipschitz-continuous variants, which we refer to as LipsAM. We prove a sufficient condition for an AM to be Lipschitz continuous and propose two architectures as examples of LipsAM. We apply the proposed architectures to a Plug-and-Play algorithm for speech dereverberation and demonstrate their improved stability through numerical experiments.
Abstract: Solving the permutation problem is essential for determined blind source separation (BSS). Existing methods, such as independent vector analysis (IVA) and independent low-rank matrix analysis (ILRMA), tackle the permutation problem by modeling the co-occurrence of the frequency components of source signals. One of the remaining challenges in these methods is the block permutation problem, which may lead to poor separation results. In this paper, we propose a simple and effective technique for solving the block permutation problem. The proposed technique splits the full frequency range into overlapping subbands and sequentially applies a BSS method (e.g., IVA, ILRMA, or any other method) to each subband. Because the splitting reduces the problem size, the BSS method can work effectively in each subband. The permutations between subbands are then aligned by using the separation result in one subband as the initial values for the other subbands. Experimental results showed that the proposed technique remarkably improved the separation performance without increasing the total computational cost.
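The subband procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `overlapping_subbands` and `subband_bss`, the parameters `band_size` and `overlap`, and the callable `bss` (standing in for IVA, ILRMA, or any other method) are all hypothetical. The sketch assumes that alignment between subbands happens through the overlapping bins, whose already-estimated values are passed to the next subband as its initial values.

```python
def overlapping_subbands(n_freq, band_size, overlap):
    """Return (start, stop) index pairs that cover bins 0..n_freq-1.

    Consecutive subbands share `overlap` bins (assumed parameterization).
    """
    if band_size <= 0 or not 0 <= overlap < band_size:
        raise ValueError("need band_size > 0 and 0 <= overlap < band_size")
    hop = band_size - overlap
    bands = []
    start = 0
    while True:
        stop = min(start + band_size, n_freq)
        bands.append((start, stop))
        if stop == n_freq:
            break
        start += hop
    return bands


def subband_bss(X, W_init, bss, band_size, overlap):
    """Apply `bss` to each subband in sequence.

    X      : per-frequency observations (any sequence indexed by frequency bin)
    W_init : per-frequency initial demixing parameters (same length as X)
    bss    : hypothetical callable (X_band, W_band_init) -> list of updated
             parameters, one per bin in the subband
    """
    W = list(W_init)
    for start, stop in overlapping_subbands(len(X), band_size, overlap):
        # Bins shared with the previous subband already hold its result,
        # so they act as the initial values that carry the permutation over.
        W[start:stop] = bss(X[start:stop], W[start:stop])
    return W
```

Each call to `bss` sees only `band_size` bins, which reflects the reduced problem size mentioned in the abstract. How the overlapping bins should be weighted or re-estimated is a design choice that this sketch leaves to `bss`.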