Abstract: Isolated sign language recognition (ISLR) has made significant progress in recent years, driven by the emergence of many sign language datasets and the development of advanced deep neural networks. However, challenges remain in applying the technique in the real world. First, existing sign language datasets do not cover the whole sign vocabulary. Second, most sign language datasets provide only single-view RGB videos, which makes it difficult to handle hand occlusions when performing ISLR. To fill this gap, this paper presents NationalCSL-DP, a dual-view sign language dataset for ISLR that fully covers the Chinese national sign language vocabulary. The dataset consists of 134,140 sign videos recorded by ten signers from two vertical views, namely, the front side and the left side. Furthermore, a CNN-transformer network is proposed as a strong baseline, together with an extremely simple but effective fusion strategy for prediction. Extensive experiments were conducted to evaluate the effectiveness of the dataset and the baseline. The results show that the proposed fusion strategy can significantly increase ISLR performance, but that it is not easy for the sequence-to-sequence model to learn the complementary features from the sign videos of the two vertical views, regardless of whether an early-fusion or late-fusion strategy is applied.
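The abstract above does not spell out its prediction-level fusion strategy, so the following is only a minimal sketch of one common way to fuse two views at prediction time: averaging the per-view class probabilities. The function names `softmax` and `fuse_predictions` and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(front_logits, left_logits):
    """Average the class probabilities of the two views (an assumed rule).

    Returns the index of the winning class and the fused probabilities.
    """
    p_front = softmax(front_logits)
    p_left = softmax(left_logits)
    fused = [(a + b) / 2 for a, b in zip(p_front, p_left)]
    best_class = max(range(len(fused)), key=fused.__getitem__)
    return best_class, fused


# Hypothetical scores for three sign classes: the front view is nearly tied
# between classes 0 and 1 (e.g. because of hand occlusion), while the left
# view clearly favours class 1.
front = [2.0, 1.9, 0.0]
left = [0.0, 3.0, 0.0]
label, probs = fuse_predictions(front, left)
print(label, [round(p, 3) for p in probs])
```

In this toy case the front view alone would pick class 0, while the fused prediction follows the more confident left view. Averaging is only one possible rule; weighted sums or products of probabilities are equally plausible readings of "fusion for prediction".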
Abstract: In this paper, we investigate integrated sensing and communication (ISAC) in high-mobility systems with the aid of an intelligent reflecting surface (IRS). To exploit the benefits of the delay-Doppler (DD) spread caused by high mobility, an orthogonal time frequency space (OTFS)-based frame structure and transmission framework are proposed. In such a framework, we first design a low-complexity ratio-based sensing algorithm for estimating the velocity of the mobile user. Then, we analyze the performance of sensing and communication in terms of the achievable mean square error (MSE) and the achievable rate, respectively, and reveal the impact of key parameters. Next, with the derived performance expressions, we jointly optimize the phase shift matrix of the IRS and the receive combining vector at the base station (BS) to improve the overall ISAC performance. Finally, extensive simulation results confirm the effectiveness of the proposed algorithms in high-mobility systems.
Abstract: An intelligent reflecting surface (IRS) has the potential to enhance sensing performance, owing to its capability of reshaping the echo signals. Unlike the existing literature, which has commonly focused on IRS beamforming optimization, this paper pays special attention to designing effective signal processing approaches for extracting sensing information from IRS-reshaped echo signals. To this end, we investigate an IRS-assisted non-line-of-sight (NLOS) target detection and multi-parameter estimation problem in orthogonal frequency division multiplexing (OFDM) systems. To address this problem, we first propose a novel detection and direction estimation framework, which includes a low-overhead hierarchical codebook that allows the IRS to generate three-dimensional beams with adjustable beam direction and width, a delay spectrum peak-based beam training scheme for detection and direction estimation, and a beam refinement scheme for further enhancing the accuracy of the direction estimation. Then, we propose a target range and velocity estimation scheme that extracts the delay-Doppler information from the IRS-reshaped echo signals. Numerical results demonstrate that the proposed schemes can achieve a 99.7% target detection rate, a 10^{-3}-rad level direction estimation accuracy, and a 10^{-6}-m/10^{-5}-m/s level range/velocity estimation accuracy.
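To illustrate how delay-Doppler information in OFDM echoes maps to range and velocity, the sketch below runs a textbook two-dimensional periodogram over a noiseless, single-target channel grid. It is not the scheme proposed in the abstract above: it assumes a simple monostatic two-way path (ignoring the BS-IRS hop), on-grid delay and Doppler, and hypothetical parameter values. All function and variable names are illustrative.

```python
import cmath

C = 3e8  # speed of light in m/s (rounded)


def synth_echo(n_sc, n_sym, delta_f, t_sym, delay, doppler):
    """Noiseless per-subcarrier, per-symbol channel H[n][m] for one target."""
    return [[cmath.exp(-2j * cmath.pi * n * delta_f * delay)
             * cmath.exp(2j * cmath.pi * m * t_sym * doppler)
             for m in range(n_sym)] for n in range(n_sc)]


def delay_doppler_peak(H):
    """Return the (delay bin, Doppler bin) of the 2D periodogram peak."""
    n_sc, n_sym = len(H), len(H[0])
    best_mag, best_bins = -1.0, (0, 0)
    for k in range(n_sc):
        for l in range(n_sym):
            acc = 0j
            for n in range(n_sc):
                for m in range(n_sym):
                    # Inverse DFT over subcarriers (delay axis),
                    # forward DFT over symbols (Doppler axis).
                    phase = 2 * cmath.pi * (n * k / n_sc - m * l / n_sym)
                    acc += H[n][m] * cmath.exp(1j * phase)
            if abs(acc) > best_mag:
                best_mag, best_bins = abs(acc), (k, l)
    return best_bins


def estimate_range_velocity(H, delta_f, t_sym, wavelength):
    """Map the peak bins to range and radial velocity (two-way path assumed)."""
    n_sc, n_sym = len(H), len(H[0])
    k, l = delay_doppler_peak(H)
    delay = k / (n_sc * delta_f)      # seconds
    doppler = l / (n_sym * t_sym)     # hertz (non-negative bins only)
    return C * delay / 2, wavelength * doppler / 2


# Hypothetical setup: 8 subcarriers at 120 kHz spacing, 8 symbols sampled
# every 0.1 ms, 28 GHz carrier, target placed exactly on grid bin (3, 2).
delta_f, t_sym, wavelength = 120e3, 1e-4, C / 28e9
H = synth_echo(8, 8, delta_f, t_sym,
               delay=3 / (8 * delta_f), doppler=2 / (8 * t_sym))
rng, vel = estimate_range_velocity(H, delta_f, t_sym, wavelength)
print(f"range = {rng:.2f} m, velocity = {vel:.2f} m/s")
```

The grid resolution here is coarse (one delay bin spans roughly 156 m), so reaching accuracy levels like those reported in the abstract would require off-grid refinement beyond this sketch.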