5G New Radio (NR) places stringent demands on both the performance and the complexity of low-density parity-check (LDPC) decoding algorithms and their VLSI implementations. Furthermore, decoders must support the full range of 5G NR blocklengths and code rates, which is a significant challenge. In this paper, we present a high-performance and low-complexity LDPC decoder, tailor-made to fulfill the 5G requirements. First, to close the gap between belief propagation (BP) decoding and its approximations in hardware, we propose an extension of adjusted min-sum decoding, called generalized adjusted min-sum (GA-MS) decoding. This decoding algorithm flexibly truncates the incoming messages at the check node level and carefully approximates the non-linear functions of BP decoding to balance error-rate performance and hardware complexity. Numerical results demonstrate that the proposed fixed-point GA-MS decoder has only a minor gap of 0.1 dB compared to floating-point BP under various scenarios of the 5G standard specifications. Second, we present a fully reconfigurable 5G NR LDPC decoder implementation based on GA-MS decoding. Given that memory occupies a substantial portion of the decoder area, we adopt multiple data compression and approximation techniques to reduce the memory overhead by 42.2%. The corresponding 28nm FD-SOI ASIC decoder has a core area of 1.823 mm$^2$ and operates at 895 MHz. It is compatible with all 5G NR LDPC codes and achieves a peak throughput of 24.42 Gbps and a maximum area efficiency of 13.40 Gbps/mm$^2$ at 4 decoding iterations.
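For orientation, the following is a minimal sketch of a conventional min-sum check node update, the hardware-friendly BP approximation that adjusted min-sum variants build on. It is not the GA-MS algorithm itself: the function name `min_sum_check_node` and the `scale` and `offset` correction parameters are illustrative assumptions standing in for whatever approximation a real decoder uses.

```python
def min_sum_check_node(msgs, offset=0.0, scale=1.0):
    """Outgoing check-to-variable messages for one check node.

    msgs: incoming variable-to-check LLRs (at least two values).
    scale, offset: hypothetical correction parameters (normalized /
    offset min-sum); they are NOT the GA-MS approximation.
    """
    mags = [abs(m) for m in msgs]
    # Min-sum only needs the smallest and second-smallest magnitudes.
    idx_min = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[idx_min]
    min2 = min(m for i, m in enumerate(mags) if i != idx_min)

    # Product of the signs of all incoming messages.
    total_sign = 1
    for m in msgs:
        if m < 0:
            total_sign = -total_sign

    out = []
    for i, m in enumerate(msgs):
        # Exclude the edge's own message: use min2 on the minimum edge.
        mag = min2 if i == idx_min else min1
        mag = max(scale * mag - offset, 0.0)
        own_sign = -1 if m < 0 else 1
        out.append(total_sign * own_sign * mag)
    return out
```

For example, `min_sum_check_node([2.0, -0.5, 1.5, -3.0])` sends the second-smallest magnitude back along the edge that supplied the minimum, and the minimum along every other edge.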
The channel impulse response (CIR) obtained from the channel estimation step of various wireless systems is a widely used source of information in wireless sensing. Breathing rate is one of the important vital signs that can be retrieved from the CIR. Recently, various works have extracted the breathing rate from one carefully selected CIR delay bin that contains the breathing information. However, it has also been shown that the accuracy of this estimation is very sensitive to the measurement scenario, e.g., the presence of obstacles between the transceivers and the target, the position of the target, and the orientation of the target, since a single CIR delay bin does not always contain a sufficiently strong periodic component to retrieve the breathing rate. We focus on such scenarios and propose a CIR delay bin fusion method that merges several CIR bins to achieve a more accurate and reliable breathing rate estimate. We take measurements and showcase the advantages of the proposed method across scenarios.
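The idea of merging several delay bins can be illustrated with a simple sketch. The fusion rule used here, summing the per-bin power spectra over a plausible breathing band and picking the peak, is an assumption chosen for illustration and is not necessarily the fusion method proposed above. The function names, the band limits `f_lo` and `f_hi`, and the frequency step are likewise hypothetical.

```python
import math

def band_power(series, fs, freqs):
    """Power of a mean-removed real series at each candidate frequency (Hz)."""
    mean = sum(series) / len(series)
    x = [s - mean for s in series]
    powers = []
    for f in freqs:
        w = 2 * math.pi * f / fs
        re = sum(v * math.cos(w * n) for n, v in enumerate(x))
        im = sum(v * math.sin(w * n) for n, v in enumerate(x))
        powers.append(re * re + im * im)
    return powers

def fused_breathing_rate(bins, fs, f_lo=0.1, f_hi=0.6, step=0.01):
    """Estimate breaths per minute from several CIR delay-bin time series.

    bins: list of equally sampled amplitude series, one per delay bin.
    fs:   sampling rate of the CIR snapshots in Hz.
    Fusion rule (assumed): sum the per-bin spectra, then take the peak.
    """
    n_steps = int(round((f_hi - f_lo) / step))
    freqs = [f_lo + k * step for k in range(n_steps + 1)]
    fused = [0.0] * len(freqs)
    for series in bins:
        for k, p in enumerate(band_power(series, fs, freqs)):
            fused[k] += p
    best = max(range(len(freqs)), key=fused.__getitem__)
    return 60.0 * freqs[best]
```

With this rule, a bin that carries only a weak breathing component still adds to the spectral peak, while interference that appears in a single bin is not reinforced by the others.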
Ultra-wideband (UWB) devices are widely used in indoor localization scenarios. Single-anchor UWB localization shows advantages because of its simple system setup compared to conventional two-way ranging (TWR) and trilateration localization methods. In this work, we focus on single-anchor UWB localization methods that learn statistical features of the channel impulse response (CIR) in different location areas using a Gaussian mixture model (GMM). We show that by learning the joint distributions of the amplitudes of different delay components, we achieve a more accurate location estimate compared to considering each delay bin independently. Moreover, we develop a similarity metric between sets of CIRs. With this set-based similarity metric, we can further improve the estimation performance, compared to treating each snapshot separately. We showcase the advantages of the proposed methods in multiple application scenarios.
We describe recursive unique projection-aggregation (RUPA) decoding and iterative unique projection-aggregation (IUPA) decoding of Reed-Muller (RM) codes, which remove non-unique projections from the recursive projection-aggregation (RPA) and iterative projection-aggregation (IPA) algorithms, respectively. We show that these algorithms have competitive error-correcting performance while requiring up to 95% fewer projections than the baseline RPA algorithm.
Some low-complexity LDPC decoders suffer from error floors. We apply iteration-dependent weights to the degree-3 variable nodes to solve this problem. When the 802.3ca EPON LDPC code is considered, an error floor decrease of more than 3 orders of magnitude is achieved.
The soft-aided bit-marking (SABM) algorithm is based on the idea of marking bits as highly reliable bits (HRBs), highly unreliable bits (HUBs), and uncertain bits to improve the performance of hard-decision (HD) decoders. The HRBs are used to help the HD decoders prevent miscorrections, and the HUBs are used to decode originally uncorrectable cases via bit flipping (BF). In this paper, an improved SABM algorithm (called iSABM) is proposed for staircase codes (SCCs). Similar to SABM, iSABM marks bits with the help of channel reliabilities, i.e., using the absolute values of the log-likelihood ratios. The improvements offered by iSABM include: (i) HUBs being classified using a reliability threshold, (ii) BF randomly selecting HUBs, and (iii) soft-aided decoding over multiple SCC blocks. The decoding complexity of iSABM is comparable to that of SABM. This is because, on the one hand, no sorting is required (lower complexity) owing to the use of a threshold for HUBs, while on the other hand, multiple SCC blocks use soft information (higher complexity). Additional gains of up to 0.53 dB with respect to SABM and 0.91 dB with respect to standard SCC decoding at a bit error rate of $10^{-6}$ are reported. Furthermore, it is shown that using 1-bit reliability marking, i.e., only having HRBs and HUBs, causes a gain penalty of only up to 0.25 dB with a significantly reduced memory requirement.
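Threshold-based marking can be sketched as follows. This is only an illustration of classifying bits by the absolute value of their log-likelihood ratio without sorting; the function name `mark_bits`, the two threshold parameters, and the label strings are assumptions, and no threshold values are taken from the abstract above.

```python
def mark_bits(llrs, hrb_threshold, hub_threshold):
    """Label each bit by the reliability |LLR| of its channel observation.

    hrb_threshold, hub_threshold: hypothetical tuning parameters with
    hub_threshold < hrb_threshold. A single pass with no sorting suffices.
    """
    marks = []
    for llr in llrs:
        reliability = abs(llr)
        if reliability >= hrb_threshold:
            marks.append("HRB")        # highly reliable bit
        elif reliability <= hub_threshold:
            marks.append("HUB")        # highly unreliable bit, BF candidate
        else:
            marks.append("UNCERTAIN")
    return marks
```

For instance, `mark_bits([5.0, -0.2, 1.5, -6.0], 4.0, 0.5)` labels the second bit as a HUB and the first and last bits as HRBs.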
In-band full-duplex systems promise to further increase the throughput of wireless systems, by simultaneously transmitting and receiving on the same frequency band. However, concurrent transmission generates a strong self-interference signal at the receiver, which requires the use of cancellation techniques. A wide range of techniques for analog and digital self-interference cancellation have already been presented in the literature. However, their evaluation focuses on cases where the underlying physical parameters of the full-duplex system do not vary significantly. In this paper, we focus on adaptive digital cancellation, motivated by the fact that physical systems change over time. We examine some of the different cancellation methods in terms of their performance and implementation complexity, considering the cost of both cancellation and training. We then present a comparative analysis of all these methods to determine which perform better under different system performance requirements. We demonstrate that with a neural network approach, the reduction in arithmetic complexity for the same cancellation performance relative to a state-of-the-art polynomial model is several orders of magnitude.
Non-linear self-interference (SI) cancellation constitutes a fundamental problem in full-duplex communications, which is typically tackled using either polynomial models or neural networks. In this work, we explore the applicability of a recently proposed method based on low-rank tensor completion, called canonical system identification (CSID), to non-linear SI cancellation. Our results show that CSID is very effective in modeling and cancelling the non-linear SI signal and can have lower computational complexity than existing methods, albeit at the cost of increased memory requirements.
Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, which call for optimizations at every level from the algorithmic description of the network to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, supporting various methods for scheduling and mapping of operations onto the accelerator. Lupulus was implemented in a 28nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz with latencies of 21.4ms and 183.6ms for the convolutional layers of AlexNet and VGG-16, respectively.
In this work, we use deep unfolding to view cascaded non-linear RF systems as model-based neural networks. This view enables the direct use of a wide range of neural network tools and optimizers to efficiently identify such cascaded models. We demonstrate the effectiveness of this approach through the example of digital self-interference cancellation in full-duplex communications where an IQ imbalance model and a non-linear PA model are cascaded in series. For a self-interference cancellation performance of approximately 44.5 dB, the number of model parameters can be reduced by 74% and the number of operations per sample can be reduced by 79% compared to an expanded linear-in-parameters polynomial model.
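A cascade of this kind can be written as a composition of two small parametric blocks, which is what makes it resemble a two-layer network. The sketch below assumes a widely-linear IQ imbalance model and a memoryless odd-order PA polynomial; both model forms, the function names, and the parameter names `k1`, `k2`, and `coeffs` are illustrative assumptions rather than the exact models used above.

```python
def iq_imbalance(x, k1, k2):
    """Widely-linear IQ imbalance (assumed form): y = k1*x + k2*conj(x)."""
    return [k1 * s + k2 * s.conjugate() for s in x]

def pa_polynomial(y, coeffs):
    """Memoryless odd-order PA model (assumed form).

    coeffs[i] multiplies y * |y|^(2i), i.e. orders 1, 3, 5, ...
    """
    return [
        sum(c * s * abs(s) ** (2 * i) for i, c in enumerate(coeffs))
        for s in y
    ]

def cascade(x, k1, k2, coeffs):
    """Forward pass of the cascade: IQ imbalance followed by the PA."""
    return pa_polynomial(iq_imbalance(x, k1, k2), coeffs)
```

In this sketch the cascade has `2 + len(coeffs)` complex parameters, whereas expanding the composition into a single linear-in-parameters polynomial produces one coefficient per product term of `x` and its conjugate.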