Most existing works on dual-function radar-communication (DFRC) systems mainly focus on active sensing, but ignore passive sensing. To leverage multi-static sensing capability, we explore integrated active and passive sensing (IAPS) in DFRC systems to remedy sensing performance. The multi-antenna base station (BS) is responsible for communication and active sensing by transmitting signals to user equipments while detecting a target according to echo signals. In contrast, passive sensing is performed at the receive access points (RAPs). We consider both the cases where the capacity of the backhaul links between the RAPs and BS is unlimited or limited and adopt different fusion strategies. Specifically, when the backhaul capacity is unlimited, the BS and RAPs transfer sensing signals they have received to the central controller (CC) for signal fusion. The CC processes the signals and leverages the generalized likelihood ratio test detector to determine the present of a target. However, when the backhaul capacity is limited, each RAP, as well as the BS, makes decisions independently and sends its binary inference results to the CC for result fusion via voting aggregation. Then, aiming at maximize the target detection probability under communication quality of service constraints, two power optimization algorithms are proposed. Finally, numerical simulations demonstrate that the sensing performance in case of unlimited backhaul capacity is much better than that in case of limited backhaul capacity. Moreover, it implied that the proposed IAPS scheme outperforms only-passive and only-active sensing schemes, especially in unlimited capacity case.
In image dehazing task, haze density is a key feature and affects the performance of dehazing methods. However, some of the existing methods lack a comparative image to measure densities, and others create intermediate results but lack the exploitation of their density differences, which can facilitate perception of density. To address these deficiencies, we propose a density-aware dehazing method named Density Feature Refinement Network (DFR-Net) that extracts haze density features from density differences and leverages density differences to refine density features. In DFR-Net, we first generate a proposal image that has lower overall density than the hazy input, bringing in global density differences. Additionally, the dehazing residual of the proposal image reflects the level of dehazing performance and provides local density differences that indicate localized hard dehazing or high density areas. Subsequently, we introduce a Global Branch (GB) and a Local Branch (LB) to achieve density-awareness. In GB, we use Siamese networks for feature extraction of hazy inputs and proposal images, and we propose a Global Density Feature Refinement (GDFR) module that can refine features by pushing features with different global densities further away. In LB, we explore local density features from the dehazing residuals between hazy inputs and proposal images and introduce an Intermediate Dehazing Residual Feedforward (IDRF) module to update local features and pull them closer to clear image features. Sufficient experiments demonstrate that the proposed method achieves results beyond the state-of-the-art methods on various datasets.
Early smoke segmentation (ESS) enables the accurate identification of smoke sources, facilitating the prompt extinguishing of fires and preventing large-scale gas leaks. But ESS poses greater challenges than conventional object and regular smoke segmentation due to its small scale and transparent appearance, which can result in high miss detection rate and low precision. To address these issues, a Focus and Separation Network (FoSp) is proposed. We first introduce a Focus module employing bidirectional cascade which guides low-resolution and high-resolution features towards mid-resolution to locate and determine the scope of smoke, reducing the miss detection rate. Next, we propose a Separation module that separates smoke images into a pure smoke foreground and a smoke-free background, enhancing the contrast between smoke and background fundamentally, improving segmentation precision. Finally, a Domain Fusion module is developed to integrate the distinctive features of the two modules which can balance recall and precision to achieve high F_beta. Futhermore, to promote the development of ESS, we introduce a high-quality real-world dataset called SmokeSeg, which contains more small and transparent smoke than the existing datasets. Experimental results show that our model achieves the best performance on three available datasets: SYN70K (mIoU: 83.00%), SMOKE5K (F_beta: 81.6%) and SmokeSeg (F_beta: 72.05%). Especially, our FoSp outperforms SegFormer by 7.71% (F_beta) for early smoke segmentation on SmokeSeg.
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. However, it suffers the leakage of private information from uploading models. In addition, as the model size grows, the training latency increases due to limited transmission bandwidth and the model performance degrades while using differential privacy (DP) protection. In this paper, we propose a gradient sparsification empowered FL framework over wireless channels, in order to improve training efficiency without sacrificing convergence performance. Specifically, we first design a random sparsification algorithm to retain a fraction of the gradient elements in each client's local training, thereby mitigating the performance degradation induced by DP and and reducing the number of transmission parameters over wireless channels. Then, we analyze the convergence bound of the proposed algorithm, by modeling a non-convex FL problem. Next, we formulate a time-sequential stochastic optimization problem for minimizing the developed convergence bound, under the constraints of transmit power, the average transmitting delay, as well as the client's DP requirement. Utilizing the Lyapunov drift-plus-penalty framework, we develop an analytical solution to the optimization problem. Extensive experiments have been implemented on three real life datasets to demonstrate the effectiveness of our proposed algorithm. We show that our proposed algorithms can fully exploit the interworking between communication and computation to outperform the baselines, i.e., random scheduling, round robin and delay-minimization algorithms.
Obviously, the object is the key factor of the 3D single object tracking (SOT) task. However, previous Siamese-based trackers overlook the negative effects brought by randomly dropped object points during backbone sampling, which hinder trackers to predict accurate bounding boxes (BBoxes). Exploring an approach that seeks to maximize the preservation of object points and their object-aware features is of particular significance. Motivated by this, we propose an Object Preserving Siamese Network (OPSNet), which can significantly maintain object integrity and boost tracking performance. Firstly, the object highlighting module enhances the object-aware features and extracts discriminative features from template and search area. Then, the object-preserved sampling selects object candidates to obtain object-preserved search area seeds and drop the background points that contribute less to tracking. Finally, the object localization network precisely locates 3D BBoxes based on the object-preserved search area seeds. Extensive experiments demonstrate our method outperforms the state-of-the-art performance (9.4% and 2.5% success gain on KITTI and Waymo Open Dataset respectively).
Infrared small target detection (ISTD) under complex backgrounds is a difficult problem, for the differences between targets and backgrounds are not easy to distinguish. Background reconstruction is one of the methods to deal with this problem. This paper proposes an ISTD method based on background reconstruction called Dynamic Background Reconstruction (DBR). DBR consists of three modules: a dynamic shift window module (DSW), a background reconstruction module (BR), and a detection head (DH). BR takes advantage of Vision Transformers in reconstructing missing patches and adopts a grid masking strategy with a masking ratio of 50\% to reconstruct clean backgrounds without targets. To avoid dividing one target into two neighboring patches, resulting in reconstructing failure, DSW is performed before input embedding. DSW calculates offsets, according to which infrared images dynamically shift. To reduce False Positive (FP) cases caused by regarding reconstruction errors as targets, DH utilizes a structure of densely connected Transformer to further improve the detection performance. Experimental results show that DBR achieves the best F1-score on the two ISTD datasets, MFIRST (64.10\%) and SIRST (75.01\%).
Infrared small-target detection (ISTD) is an important computer vision task. ISTD aims at separating small targets from complex background clutter. The infrared radiation decays over distances, making the targets highly dim and prone to confusion with the background clutter, which makes the detector challenging to balance the precision and recall rate. To deal with this difficulty, this paper proposes a neural-network-based ISTD method called CourtNet, which has three sub-networks: the prosecution network is designed for improving the recall rate; the defendant network is devoted to increasing the precision rate; the jury network weights their results to adaptively balance the precision and recall rate. Furthermore, the prosecution network utilizes a densely connected transformer structure, which can prevent small targets from disappearing in the network forward propagation. In addition, a fine-grained attention module is adopted to accurately locate the small targets. Experimental results show that CourtNet achieves the best F1-score on the two ISTD datasets, MFIRST (0.62) and SIRST (0.73).
Automatic modulation classification (AMC) is a key technique for desiging non-cooperative communication systems, and deep learning (DL) is applied effectively into AMC for improving the classification accuracy.However, most of the DL-based AMC methods have a large number of parameters and high computational complexity, and they cannot be directly applied into scenarios with limited computing power and storage space.In this paper, we propose a lightweight and low-complexity AMC method using ultra lite convolutional neural network (ULCNN), which is based on multiple tricks, including data augmentation, complex-valued convolution, separable convolution, channel attention, channel shuffle. Simulation results demonstrate that our proposed ULCNN-based AMC method achieves the average accuracy of 62.47\% on RML2016.10a and only 9,751 parameters. Moreover, ULCNN is verified on a typical edge device (Raspberry Pi), where interference time per sample is about 0.775 ms. The reproducible code can be download from GitHub at https://github.com/BeechburgPieStar/Ultra-Lite-Convolutional-Neural-Network-for-Automatic-Modulation-Classification .
Recently, deep neural network (DNN)-based physical layer communication techniques have attracted considerable interest. Although their potential to enhance communication systems and superb performance have been validated by simulation experiments, little attention has been paid to the theoretical analysis. Specifically, most studies in the physical layer have tended to focus on the application of DNN models to wireless communication problems but not to theoretically understand how does a DNN work in a communication system. In this paper, we aim to quantitatively analyze why DNNs can achieve comparable performance in the physical layer comparing with traditional techniques, and also drive their cost in terms of computational complexity. To achieve this goal, we first analyze the encoding performance of a DNN-based transmitter and compare it to a traditional one. And then, we theoretically analyze the performance of DNN-based estimator and compare it with traditional estimators. Third, we investigate and validate how information is flown in a DNN-based communication system under the information theoretic concepts. Our analysis develops a concise way to open the "black box" of DNNs in physical layer communication, which can be applied to support the design of DNN-based intelligent communication techniques and help to provide explainable performance assessment.
Based on our observations of infrared targets, serious scale variation along within sequence frames has high-frequently occurred. In this paper, we propose a dynamic re-parameterization network (DRPN) to deal with the scale variation and balance the detection precision between small targets and large targets in infrared datasets. DRPN adopts the multiple branches with different sizes of convolution kernels and the dynamic convolution strategy. Multiple branches with different sizes of convolution kernels have different sizes of receptive fields. Dynamic convolution strategy makes DRPN adaptively weight multiple branches. DRPN can dynamically adjust the receptive field according to the scale variation of the target. Besides, in order to maintain effective inference in the test phase, the multi-branch structure is further converted to a single-branch structure via the re-parameterization technique after training. Extensive experiments on FLIR, KAIST, and InfraPlane datasets demonstrate the effectiveness of our proposed DRPN. The experimental results show that detectors using the proposed DRPN as the basic structure rather than SKNet or TridentNet obtained the best performances.