While preserving the privacy of federated learning (FL), differential privacy (DP) inevitably degrades the utility (i.e., accuracy) of FL because of the model perturbations caused by the DP noise added to model updates. Existing studies have exclusively considered noise with a persistent root-mean-square amplitude and overlooked the opportunity to adjust the amplitude to alleviate the adverse effects of the noise. This paper presents a new DP perturbation mechanism with a time-varying noise amplitude to protect the privacy of FL and retain the capability of adjusting the learning performance. Specifically, we propose a geometric series form for the noise amplitude and reveal analytically the dependence of the series on the number of global aggregations and the $(\epsilon,\delta)$-DP requirement. We derive an online refinement of the series to prevent FL from premature convergence resulting from excessive perturbation noise. We also develop an upper bound on the loss function of a multi-layer perceptron (MLP) trained by FL running the new DP mechanism. Accordingly, the optimal number of global aggregations is obtained, balancing learning performance and privacy. Extensive experiments are conducted using MLP, support vector machine, and convolutional neural network models on four public datasets. The experiments corroborate the contribution of the new DP mechanism to the convergence and accuracy of privacy-preserving FL, compared to the state-of-the-art Gaussian noise mechanism with a persistent noise amplitude.
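As a rough illustration of the geometric-series noise amplitude described above, the following minimal sketch generates a per-round noise schedule and perturbs a model update with it. The names `sigma0`, `ratio`, `geometric_noise_schedule`, and `perturb_update` are hypothetical, and the values are arbitrary. The abstract does not state how the first term and the common ratio follow from the number of global aggregations and the $(\epsilon,\delta)$-DP requirement, nor whether the amplitude grows or shrinks over rounds, so none of that is reproduced here.

```python
import random


def geometric_noise_schedule(sigma0, ratio, num_rounds):
    """Return the noise standard deviations sigma0 * ratio**t for t = 0..num_rounds-1.

    sigma0 and ratio are illustrative inputs; in a DP mechanism they would
    have to be chosen to satisfy the privacy budget.
    """
    if sigma0 < 0 or ratio <= 0:
        raise ValueError("sigma0 must be non-negative and ratio must be positive")
    return [sigma0 * ratio ** t for t in range(num_rounds)]


def perturb_update(update, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each coordinate."""
    return [w + rng.gauss(0.0, sigma) for w in update]


# Example: four aggregation rounds with an amplitude that halves each round.
schedule = geometric_noise_schedule(sigma0=1.0, ratio=0.5, num_rounds=4)
rng = random.Random(0)
noisy_update = perturb_update([0.1, -0.2, 0.3], schedule[0], rng)
```

A ratio of 1 recovers the persistent-amplitude baseline, which makes the two mechanisms easy to compare within the same training loop.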
This paper presents a new deep reinforcement learning (DRL)-based approach to the trajectory planning and jamming rejection of an unmanned aerial vehicle (UAV) for Internet-of-Things (IoT) applications. Jamming can prevent timely delivery of sensing data and reception of operation instructions. With the assistance of a reconfigurable intelligent surface (RIS), we propose to augment the radio environment, suppress jamming signals, and enhance the desired signals. The UAV is designed to learn its trajectory and the RIS configuration based solely on changes in its received data rate, using the latest deep deterministic policy gradient (DDPG) and twin delayed DDPG (TD3) models. Simulations show that the proposed DRL algorithms provide the UAV with strong resistance against jamming, and that the TD3 algorithm exhibits faster and smoother convergence than the DDPG algorithm and is better suited to larger RISs. This DRL-based approach eliminates the need for knowledge of the channels involving the RIS and jammer, thereby offering significant practical value.
The domain adaptation (DA) approaches available to date are usually not well suited for practical DA scenarios of remote sensing image classification, since these methods (such as unsupervised DA) rely on rich prior knowledge about the relationship between the label sets of the source and target domains, and source data are often not accessible due to privacy or confidentiality issues. To address this, we propose a practical universal domain adaptation setting for remote sensing image scene classification that requires no prior knowledge of the label sets. Furthermore, a novel universal domain adaptation method without source data is proposed for cases when the source data are unavailable. The architecture of the model is divided into two parts: the source data generation stage and the model adaptation stage. The first stage estimates the conditional distribution of source data from the pre-trained model using the knowledge of class-separability in the source domain and then synthesizes the source data. With this synthetic source data in hand, it becomes a universal DA task to classify a target sample correctly if it belongs to any category in the source label set, or mark it as ``unknown'' otherwise. In the second stage, a novel transferable weight that distinguishes the shared and private label sets in each domain promotes the adaptation in the automatically discovered shared label set and recognizes the ``unknown'' samples successfully. Empirical results show that the proposed model is effective and practical for remote sensing image scene classification, regardless of whether the source data are available or not. The code is available at https://github.com/zhu-xlab/UniDA.
Reconfigurable intelligent surfaces (RISs) can potentially combat jamming attacks by diffusing jamming signals. This paper jointly optimizes user selection, channel allocation, modulation-coding, and RIS configuration in a multiuser OFDMA system under a jamming attack. This problem is non-trivial and has never been addressed, because of its mixed-integer programming nature and the difficulty of acquiring channel state information (CSI) involving the RIS and jammer. We propose a new deep reinforcement learning (DRL)-based approach, which learns only through changes in the received data rates of the users to reject the jamming signals and maximize the sum rate of the system. The key idea is that we decouple the discrete selection of users, channels, and modulation-coding from the continuous RIS configuration, hence facilitating the RIS configuration with the latest twin delayed deep deterministic policy gradient (TD3) model. We also show that a winner-takes-all strategy is almost surely optimal for selecting the users, channels, and modulation-coding, given a learned RIS configuration. Simulations show that the new approach converges quickly and realizes the benefit of the RIS, owing to its substantially small state and action spaces. Because it does not require the CSI, the approach is promising and offers practical value.
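One way to picture the winner-takes-all selection mentioned above is the sketch below, which gives each channel to the single user with the highest achievable rate on it. This is an assumed, simplified reading: the table `rates` and the function `winner_takes_all` are hypothetical, the modulation-coding choice is folded into the rate values, and the abstract does not spell out the exact rule used in the paper.

```python
def winner_takes_all(rates):
    """Assign every channel to the user with the largest rate on that channel.

    rates[u][c] is the (assumed known) achievable rate of user u on channel c
    for a fixed RIS configuration. Returns (assignment, sum_rate), where
    assignment[c] is the index of the user selected for channel c.
    """
    num_users = len(rates)
    num_channels = len(rates[0])
    assignment = []
    sum_rate = 0.0
    for c in range(num_channels):
        # Ties are broken in favor of the lowest user index.
        best_user = max(range(num_users), key=lambda u: rates[u][c])
        assignment.append(best_user)
        sum_rate += rates[best_user][c]
    return assignment, sum_rate


# Example: two users, three channels.
rates = [
    [1.0, 4.0, 2.0],  # user 0
    [3.0, 0.5, 2.5],  # user 1
]
assignment, sum_rate = winner_takes_all(rates)
```

Because the discrete selection reduces to a per-channel maximum once the RIS configuration is fixed, it does not need to be part of the learned action, which is consistent with the small action space the abstract reports.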
Integrated sensing and communication (ISAC) has the advantages of efficient spectrum utilization and low hardware cost. It is a promising candidate for the fifth-generation-advanced (5G-A) and sixth-generation (6G) mobile communication systems, with the potential to support intelligent applications requiring both communication and high-accuracy sensing capabilities. As the fundamental technology of ISAC, the ISAC signal directly impacts the performance of sensing and communication. This article systematically reviews the literature on ISAC signals from the perspective of mobile communication systems, including ISAC signal design, ISAC signal processing algorithms, and ISAC signal optimization. We first review the ISAC signal design based on 5G, 5G-A, and 6G mobile communication systems. Then, radar signal processing methods are reviewed for ISAC signals, mainly including the channel information matrix method, the spectrum lines estimator method, and the super resolution method. In terms of signal optimization, we summarize peak-to-average power ratio (PAPR) optimization, interference management, and adaptive signal optimization for ISAC signals. This article may provide guidelines for research on ISAC signals in 5G-A and 6G mobile communication systems.
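For readers unfamiliar with the PAPR metric mentioned above, the following sketch computes it for a block of complex baseband samples, using the standard definition (peak instantaneous power divided by mean power, in dB). The helper names `idft` and `papr_db` and the 8-subcarrier example are illustrative assumptions and are not taken from the reviewed literature.

```python
import cmath
import math


def idft(symbols):
    """Naive inverse DFT: map frequency-domain symbols to time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of complex samples, in decibels."""
    powers = [abs(x) ** 2 for x in samples]
    mean_power = sum(powers) / len(powers)
    return 10.0 * math.log10(max(powers) / mean_power)


# Identical symbols on all 8 subcarriers add coherently into a single time-domain
# peak, which is the worst case for a multicarrier block of this size.
worst_case = papr_db(idft([1.0] * 8))

# A constant-envelope sequence has every sample at the same power.
constant_envelope = papr_db([1, 1j, -1, -1j])
```

The worst case here equals $10\log_{10} 8$ dB, while the constant-envelope sequence sits at 0 dB. PAPR optimization aims to move multicarrier waveforms toward the latter.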
Computational reconstruction plays a vital role in computer vision and computational photography. Most conventional optimization and deep learning techniques explore local information for reconstruction. Recently, nonlocal low-rank (NLR) reconstruction has achieved remarkable success in improving accuracy and generalization. However, the computational cost has inhibited NLR from seeking global structural similarity, which consequently keeps it trapped in the tradeoff between accuracy and efficiency and prevents it from being applied to high-dimensional large-scale tasks. To address this challenge, we report here the global low-rank (GLR) optimization technique, realizing highly efficient large-scale reconstruction with global self-similarity. Inspired by the self-attention mechanism in deep learning, GLR extracts exemplar image patches by feature detection instead of conventional uniform selection. This directly produces key patches using structural features to avoid burdensome computational redundancy. Further, it performs patch matching across the entire image via neural-based convolution, which produces the global similarity heat map in parallel, rather than conventional sequential block-wise matching. As such, GLR improves patch grouping efficiency by more than one order of magnitude. We experimentally demonstrate GLR's effectiveness on temporal, frequency, and spectral dimensions, including different computational imaging modalities of compressive temporal imaging, magnetic resonance imaging, and multispectral filter array demosaicing. This work demonstrates the superiority of inherently fusing deep learning strategies with iterative optimization, and breaks the persistent dilemma of the tradeoff between accuracy and efficiency for various large-scale reconstruction tasks.
Recently, the Transformer architecture has been introduced into image restoration to replace the convolutional neural network (CNN), with surprising results. Considering the high computational complexity of the Transformer with global attention, some methods use a local square window to limit the scope of self-attention. However, these methods lack direct interaction among different windows, which limits the establishment of long-range dependencies. To address this issue, we propose a new image restoration model, Cross Aggregation Transformer (CAT). The core of our CAT is the Rectangle-Window Self-Attention (Rwin-SA), which utilizes horizontal and vertical rectangle window attention in different heads in parallel to expand the attention area and aggregate the features across different windows. We also introduce the Axial-Shift operation for different window interactions. Furthermore, we propose the Locality Complementary Module to complement the self-attention mechanism, which incorporates the inductive bias of CNN (e.g., translation invariance and locality) into the Transformer, enabling global-local coupling. Extensive experiments demonstrate that our CAT outperforms recent state-of-the-art methods on several image restoration applications. The code and models are available at https://github.com/zhengchen1999/CAT.
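To make the difference between horizontal and vertical rectangle windows concrete, the sketch below partitions the positions of a feature map into non-overlapping rectangular windows. It covers only the grouping step, not the attention computation or the Axial-Shift operation. The function `rect_window_partition` and the window sizes are hypothetical and are not taken from the CAT code base.

```python
def rect_window_partition(height, width, win_h, win_w):
    """Split a height x width grid into non-overlapping win_h x win_w windows.

    Returns a list of windows, each a list of (row, col) positions. Attention
    restricted to one window would only mix the positions listed in it.
    """
    if height % win_h or width % win_w:
        raise ValueError("grid size must be divisible by the window size")
    windows = []
    for top in range(0, height, win_h):
        for left in range(0, width, win_w):
            windows.append(
                [(r, c) for r in range(top, top + win_h) for c in range(left, left + win_w)]
            )
    return windows


# Wide windows (1 x 4) for one group of heads, tall windows (4 x 1) for another.
horizontal = rect_window_partition(4, 4, win_h=1, win_w=4)
vertical = rect_window_partition(4, 4, win_h=4, win_w=1)
```

A position such as `(0, 0)` shares a window with its whole row under the horizontal partition and with its whole column under the vertical one, so running both in parallel covers a cross-shaped area at the cost of two small windows.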
While deep learning-based text-to-speech (TTS) models such as VITS have shown excellent results, they typically require a sizable set of high-quality <text, audio> pairs to train, which is expensive to collect. So far, most languages in the world still lack the training data needed to develop TTS systems. This paper proposes two improvements that address two problems faced by low-resource Mongolian speech synthesis: a) The lack of high-quality <text, audio> pairs makes it difficult to model the mapping from linguistic features to acoustic features; this is addressed using a pre-trained VITS model and transfer learning. b) To address the scarcity of labeled information, this paper proposes an automatic prosodic annotation method that labels the prosodic information of the text and the corresponding speech, thereby improving the naturalness and intelligibility of low-resource Mongolian speech synthesis. In empirical evaluation, the proposed method achieves an N-MOS of 4.195 and an I-MOS of 4.228.
Underwater automatic target recognition (UATR) has been a challenging research topic in ocean engineering. Although deep learning brings opportunities for target recognition on land and in the air, underwater target recognition techniques based on deep learning have lagged due to limited sensor performance and the small size of trainable data. This letter proposes a framework for learning the visual representation of underwater acoustic imagery, which takes a transformer-based style transfer model as its main body. It can replace the low-level texture features of optical images with the visual features of underwater acoustic imagery while preserving their raw high-level semantic content. The proposed framework can fully use the rich optical image dataset to generate a pseudo-acoustic image dataset and use it as the initial sample to train the underwater acoustic target recognition model. The experiments use the dual-frequency identification sonar (DIDSON) as the underwater acoustic data source and take fish, the most common marine creature, as the research subject. Experimental results show that the proposed method can generate high-quality and high-fidelity pseudo-acoustic samples, achieve the purpose of acoustic data enhancement, and provide support for research on domain transfer between underwater acoustic and optical images.
In this paper, we propose a novel Kalman Filter (KF)-based uplink (UL) joint communication and sensing (JCAS) scheme, which can significantly reduce the range and location estimation errors due to the clock asynchronism between the base station (BS) and user equipment (UE). Clock asynchronism causes time-varying time offset (TO) and carrier frequency offset (CFO), leading to major challenges in uplink sensing. Unlike existing technologies, our scheme does not require knowing the location of the UE in advance, and retains the linearity of the sensing parameter estimation problem. We first estimate the angles of arrival (AoAs) of the multipaths and use them to spatially filter the channel state information (CSI). Then, we propose a KF-based CSI enhancer that exploits the estimation of Doppler with CFO as the prior information to significantly suppress the time-varying noise-like TO terms in the spatially filtered CSI. Subsequently, we can accurately estimate the ranges of the UE and the scatterers based on the KF-enhanced CSI. Finally, we identify the UE's AoA and range estimates and locate the UE, and then locate the dumb scatterers using the bi-static system. Simulation results validate the proposed scheme. The localization root mean square error of the proposed method is about 20 dB lower than that of the benchmark scheme.
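As background for the KF-based enhancer mentioned above, the sketch below is a textbook scalar Kalman filter that tracks a slowly varying quantity from noisy measurements. It is not the paper's CSI enhancer: the random-walk state model and the names `kalman_filter_1d`, `process_var`, and `meas_var` are assumptions made purely for illustration.

```python
def kalman_filter_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    State model:        x[k] = x[k-1] + w,  w ~ N(0, process_var)
    Measurement model:  z[k] = x[k] + v,    v ~ N(0, meas_var)
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is carried over and its uncertainty grows.
        p = p + process_var
        # Update: blend the prediction and the measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Example: twenty identical measurements of a constant true value.
estimates = kalman_filter_1d([5.0] * 20, process_var=0.01, meas_var=1.0)
```

The gain weights each new measurement against the prediction, which is how a prior, such as the Doppler estimate in the abstract, can suppress noise-like terms without fully discarding the measurements.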