In high-sample-rate applications of the least-mean-square (LMS) adaptive filtering algorithm, pipelining and/or block processing is required. In this paper, a stochastic analysis of the delayed block LMS algorithm is presented. As opposed to earlier work, pipelining and block processing are jointly considered and extensively examined. Separate analyses for the steady state and the transient state, based on the recursive relation of the delayed block excess mean square error (MSE), are presented to estimate the step-size bound, adaptation accuracy, and adaptation speed. The effect of different amounts of pipelining delay and different block sizes on the adaptation accuracy and speed of the adaptive filter is studied for various numbers of filter taps and speed-ups. It is concluded that, for a constant speed-up, a large delay and small block size lead to a slower convergence rate than a small delay and large block size, with almost the same steady-state MSE. Monte Carlo simulations show good agreement with the proposed estimates for Gaussian inputs.
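To make the delayed block LMS structure concrete, the following is a minimal Python sketch of one possible formulation: weights are updated once per block, and the applied gradient is the one computed `delay` blocks earlier, modeling the pipelining latency. All names and parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def delayed_block_lms(x, d, num_taps, block_size, delay, mu):
    """Delayed block LMS sketch: one weight update per block, using a
    gradient estimate that is `delay` blocks old (pipelining delay)."""
    w = np.zeros(num_taps)
    num_blocks = (len(x) - num_taps) // block_size
    grad_fifo = [np.zeros(num_taps)] * delay  # pipeline of delayed gradients
    e_hist = []
    for k in range(num_blocks):
        grad = np.zeros(num_taps)
        for i in range(block_size):
            n = num_taps - 1 + k * block_size + i
            u = x[n - num_taps + 1 : n + 1][::-1]  # tap-delay-line input vector
            e = d[n] - w @ u                       # a priori error
            grad += e * u                          # accumulate block gradient
            e_hist.append(e)
        grad_fifo.append(grad)
        w = w + mu * grad_fifo.pop(0)  # apply gradient from `delay` blocks ago
    return w, np.array(e_hist)
```

With `delay = 0` this reduces to the ordinary block LMS algorithm; larger `delay` values illustrate the slower convergence that the analysis above attributes to pipelining latency.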
In this paper, new insights into frequency-domain implementations of digital finite-length impulse response filtering (linear convolution) using overlap-add and overlap-save techniques are provided. It is shown that, in practical finite-wordlength implementations, the overall system corresponds to a time-varying system that can be represented in essentially two different ways. One way is to represent the system with a distortion function and aliasing functions, which in this paper is derived from multirate filter bank representations. The other way is to use a periodically time-varying impulse-response representation or, equivalently, a set of time-invariant impulse responses and the corresponding frequency responses. The paper provides systematic derivations and analyses of these representations along with filter impulse response properties and design examples. The representations are particularly useful when analyzing the effect of coefficient quantization as well as the use of shorter DFT lengths than theoretically required. A comprehensive computational-complexity analysis is also provided, and accurate formulas for estimating the optimal DFT lengths for given filter lengths are derived. Using optimal DFT lengths, it is shown that the frequency-domain implementations have lower computational complexities (multiplication rates) than the corresponding time-domain implementations for filter lengths that are shorter than those reported earlier in the literature. In particular, for general (unsymmetric) filters, the frequency-domain implementations are shown to be more efficient for all filter lengths. This opens up new considerations when comparing the complexities of different filter implementations.
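As background for the discussion above, the following is a minimal Python sketch of the standard overlap-save technique for computing a linear convolution with length-`nfft` DFT blocks: each block yields `nfft - len(h) + 1` valid output samples, and the first `len(h) - 1` circularly aliased samples of each block are discarded. This is textbook overlap-save in floating point, not the finite-wordlength analysis of the paper; names and parameters are illustrative.

```python
import numpy as np

def overlap_save(x, h, nfft):
    """Linear convolution of x with FIR filter h via overlap-save,
    using length-nfft DFT blocks (requires nfft >= len(h))."""
    M = len(h)
    L = nfft - M + 1              # valid output samples per block
    H = np.fft.rfft(h, nfft)      # filter frequency response (precomputed)
    # Prepend M-1 zeros (overlap) and pad the tail so every block is full.
    x_pad = np.concatenate([np.zeros(M - 1), x, np.zeros(L)])
    y = np.zeros(len(x) + M - 1)  # full linear-convolution length
    pos = 0
    while pos < len(y):
        block = x_pad[pos : pos + nfft]
        if len(block) < nfft:
            block = np.concatenate([block, np.zeros(nfft - len(block))])
        yb = np.fft.irfft(np.fft.rfft(block) * H, nfft)
        n = min(L, len(y) - pos)
        y[pos : pos + n] = yb[M - 1 : M - 1 + n]  # keep alias-free samples
        pos += L
    return y
```

The multiplication-rate trade-off analyzed in the paper comes from the choice of `nfft`: larger DFTs amortize the per-block transforms over more output samples but cost more per transform, which is why an optimal DFT length exists for each filter length.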