Shitz
Abstract:Second-order methods are widely adopted to improve the convergence rate of learning algorithms. In federated learning (FL), these methods require the clients to share their local Hessian matrices with the parameter server (PS), which comes at a prohibitive communication cost. A classical solution to this issue is to approximate the global Hessian matrix from the first-order information. Unlike in idealized networks, this solution does not perform effectively in over-the-air FL settings, where the PS receives noisy versions of the local gradients. This paper introduces a novel second-order FL framework tailored for wireless channels. The pivotal innovation lies in the PS's capability to directly estimate the global Hessian matrix from the received noisy local gradients via a non-parametric method: the PS models the unknown Hessian matrix as a Gaussian process, and then uses the temporal relation between the gradients and Hessian along with the channel model to find a stochastic estimator for the global Hessian matrix. We refer to this method as Gaussian process-based Hessian modeling for wireless FL (GP-FL) and show that it exhibits a linear-quadratic convergence rate. Numerical experiments on various datasets demonstrate that GP-FL outperforms all classical baseline first- and second-order FL approaches.




Abstract:Flexible-antenna systems have recently received significant research interest due to their capability to reconfigure wireless channels intelligently. This paper focuses on a new type of flexible-antenna technology, termed pinching antennas, which can be realized by applying small dielectric particles on a waveguide. Analytical results are first developed for the simple case with a single pinching antenna and a single waveguide, where the unique feature of the pinching-antenna system to create strong line-of-sight links and mitigate large-scale path loss is demonstrated. An advantageous feature of pinching-antenna systems is that multiple pinching antennas can be activated on a single waveguide at no extra cost; however, they must be fed with the same signal. This feature motivates the application of non-orthogonal multiple access (NOMA), and analytical results are provided to demonstrate the superior performance of NOMA-assisted pinching-antenna systems. Finally, the case with multiple pinching antennas and multiple waveguides is studied, which resembles a classical multiple-input single-output (MISO) interference channel. By exploiting the capability of pinching antennas to reconfigure the wireless channel, it is revealed that a performance upper bound on the interference channel becomes achievable, where the achievability conditions are also identified. Computer simulation results are presented to verify the developed analytical results and demonstrate the superior performance of pinching-antenna systems.




Abstract:In this paper, we consider a point-to-point integrated sensing and communication (ISAC) system, where a transmitter conveys a message to a receiver over a channel with memory and simultaneously estimates the state of the channel through the backscattered signals from the emitted waveform. Using Massey's concept of directed information for channels with memory, we formulate the capacity-distortion tradeoff for the ISAC problem when sensing is performed in an online fashion. Optimizing the transmit waveform for this system to simultaneously achieve good communication and sensing performance is a complicated task, and thus we propose a deep reinforcement learning (RL) approach to find a solution. The proposed approach enables the agent to optimize the ISAC performance by learning a reward that reflects the difference between the communication gain and the sensing loss. Since the state-space in our RL model is a priori unbounded, we employ the deep deterministic policy gradient (DDPG) algorithm. Our numerical results suggest a significant performance improvement when one considers an unbounded state-space as opposed to a simpler RL problem with a reduced state-space. In the extreme case of a degenerate state-space, only memoryless signaling strategies are possible. Our results thus emphasize the necessity of fully exploiting the memory inherent in ISAC systems.
Abstract:Data injection attacks (DIAs) pose a significant cybersecurity threat to the Smart Grid by enabling an attacker to compromise the integrity of data acquisition and manipulate estimated states without triggering bad data detection procedures. To mitigate this vulnerability, the moving target defense (MTD) alters branch admittances to mismatch the system information that is available to an attacker, thereby inducing an imperfect DIA construction that results in degradation of attack performance. In this paper, we first analyze the existence of stealth attacks for the case in which the MTD strategy only changes the admittance of a single branch. Equipped with this initial insight, we then extend the results to the case in which multiple branches are protected by the MTD strategy. Remarkably, we show that stealth attacks can be constructed with information only about which branches are protected, without knowledge about the particular admittance value changes. Furthermore, we use graph-theoretic tools to provide a sufficient protection condition for the MTD strategy that guarantees that the system is not vulnerable to DIAs. Numerical simulations are implemented on IEEE test systems to validate the obtained results.
Abstract:Co-channel interference cancellation (CCI) is the process used to reduce interference from other signals using the same frequency channel, thereby enhancing the performance of wireless communication systems. An improvement to this approach is blind CCI, which reduces interference without relying on prior knowledge of the interfering signal characteristics. Recent work suggested using machine learning (ML) models for this purpose, but high-throughput ML solutions are still lacking, especially for edge devices with limited resources. This work explores the adaptation of U-Net Convolutional Neural Network models for high-throughput blind source separation. Our approach is based on architectural modifications, notably quantization and the incorporation of depthwise separable convolution, to achieve a balance between computational efficiency and performance. Our results demonstrate that the proposed models achieve superior MSE scores when removing unknown interference sources from the signals while maintaining significantly lower computational complexity compared to baseline models. One of our proposed models is deeper and fully convolutional, while the other is shallower with a convolutional structure incorporating an LSTM. Depthwise separable convolution and quantization further reduce the memory footprint and computational demands, albeit with some performance trade-offs. Specifically, applying depthwise separable convolutions to the model with the LSTM results in only a 0.72% degradation in MSE score while reducing MACs by 58.66%. For the fully convolutional model, we observe a 0.63% improvement in MSE score with 61.10% fewer MACs. Overall, our findings underscore the feasibility of using optimized machine-learning models for interference cancellation in devices with limited resources.
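To give a feel for where MAC savings of this kind come from, the sketch below counts multiply-accumulate operations for a single 1-D convolutional layer, once as a standard convolution and once as a depthwise separable one (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The layer sizes are hypothetical and are not taken from the paper. The 58.66% and 61.10% figures above are whole-model numbers, which this single-layer count does not attempt to reproduce.

```python
def conv1d_macs(c_in, c_out, kernel, length):
    """MACs of a standard 1-D convolution producing `length` output samples."""
    return c_in * c_out * kernel * length


def depthwise_separable_macs(c_in, c_out, kernel, length):
    """MACs of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = c_in * kernel * length   # one filter per input channel
    pointwise = c_in * c_out * length    # 1x1 convolution mixes the channels
    return depthwise + pointwise


def mac_reduction(c_in, c_out, kernel, length):
    """Fraction of MACs saved by replacing a standard layer with a separable one."""
    standard = conv1d_macs(c_in, c_out, kernel, length)
    separable = depthwise_separable_macs(c_in, c_out, kernel, length)
    return 1.0 - separable / standard


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 64 channels, kernel size 3, 1024 output samples.
    print(f"MACs saved: {100 * mac_reduction(64, 64, 3, 1024):.2f}%")
```

For a single layer the saved fraction works out to 1 - (1/c_out + 1/kernel), so the benefit grows with both the kernel size and the number of output channels.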




Abstract:Inspired by biological processes, neuromorphic computing utilizes spiking neural networks (SNNs) to perform inference tasks, offering significant efficiency gains for workloads involving sequential data. Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy. In a split computing architecture, where the SNN is divided across two separate devices, the device storing the first layers must share information about the spikes generated by the local output neurons with the other device. Consequently, the advantages of multi-level spikes must be balanced against the challenges of transmitting additional bits between the two devices. This paper addresses these challenges by investigating a wireless neuromorphic split computing architecture employing multi-level SNNs. For this system, we present the design of digital and analog modulation schemes optimized for an orthogonal frequency division multiplexing (OFDM) radio interface. Simulation and experimental results using software-defined radios provide insights into the performance gains of multi-level SNN models and the optimal payload size as a function of the quality of the connection between a transmitter and receiver.




Abstract:Network optimization is a fundamental challenge in Internet of Things (IoT) networks, where the underlying problems often have complex features that make them difficult to solve. Recently, generative diffusion models (GDMs) have emerged as a promising new approach to network optimization, with the potential to directly address these optimization problems. However, the application of GDMs in this field is still in its early stages, and there is a noticeable lack of theoretical research and empirical findings. In this study, we first explore the intrinsic characteristics of generative models. Next, we provide a concise theoretical proof and intuitive demonstration of the advantages of generative models over discriminative models in network optimization. Based on this exploration, we implement GDMs as optimizers aimed at learning high-quality solution distributions for given inputs, sampling from these distributions during inference to approximate or achieve optimal solutions. Specifically, we utilize denoising diffusion probabilistic models (DDPMs) and employ a classifier-free guidance mechanism to manage conditional guidance based on input parameters. We conduct extensive experiments across three challenging network optimization problems. By investigating various model configurations and the principles of GDMs as optimizers, we demonstrate the ability to overcome prediction errors and validate the convergence of generated solutions to optimal solutions. We provide code and data at https://github.com/qiyu3816/DiffSG.
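As a rough illustration of the classifier-free guidance step mentioned above, the sketch below combines a conditional and an unconditional noise prediction in the commonly used form (1 + w) * eps_cond - w * eps_uncond. The function name, the guidance weight w, and the toy vectors are assumptions for illustration only. The abstract does not say which parameterization the paper uses, and the linked repository may differ.

```python
def guided_noise(eps_cond, eps_uncond, w):
    """Classifier-free guidance: extrapolate from the unconditional
    noise prediction towards the conditional one with weight w.

    w = 0 recovers the purely conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    # Toy noise predictions standing in for the outputs of a trained denoiser.
    eps_cond = [1.0, 2.0]
    eps_uncond = [0.0, 1.0]
    print(guided_noise(eps_cond, eps_uncond, 1.0))
```

In a DDPM sampler, this guided prediction would replace the plain noise estimate at each reverse step.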




Abstract:Collaborative inference among multiple wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications, particularly for sensing and computer vision. This approach typically involves a three-stage process: a) data acquisition through sensing, b) feature extraction, and c) feature encoding for transmission. However, transmitting the extracted features poses a significant privacy risk, as sensitive personal data can be exposed during the process. To address this challenge, we propose a novel privacy-preserving collaborative inference mechanism, wherein each edge device in the network secures the privacy of extracted features before transmitting them to a central server for inference. Our approach is designed to achieve two primary objectives: 1) reducing communication overhead and 2) ensuring strict privacy guarantees during feature transmission, while maintaining effective inference performance. Additionally, we introduce an over-the-air pooling scheme specifically designed for classification tasks, which provides formal guarantees on the privacy of transmitted features and establishes a lower bound on classification accuracy.
Abstract:This paper addresses the problem of detecting changes when only unnormalized pre- and post-change distributions are accessible. This situation happens in many scenarios in physics such as in ferromagnetism, crystallography, magneto-hydrodynamics, and thermodynamics, where the energy models are difficult to normalize. Our approach is based on the estimation of the Cumulative Sum (CUSUM) statistics, which is known to produce optimal performance. We first present an intuitively appealing approximation method. Unfortunately, this produces a biased estimator of the CUSUM statistics and may cause performance degradation. We then propose the Log-Partition Approximation Cumulative Sum (LPA-CUSUM) algorithm based on thermodynamic integration (TI) in order to estimate the log-ratio of normalizing constants of pre- and post-change distributions. It is proved that this approach gives an unbiased estimate of the log-partition function and the CUSUM statistics, and leads to an asymptotically optimal performance. Moreover, we derive a relationship between the required sample size for thermodynamic integration and the desired detection delay performance, offering guidelines for practical parameter selection. Numerical studies are provided demonstrating the efficacy of our approach.
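For orientation, here is a minimal sketch of the classical CUSUM recursion that the approach above builds on. It assumes fully normalized Gaussian pre- and post-change densities, so the log-likelihood ratio is available in closed form. It is not the LPA-CUSUM algorithm: with unnormalized densities the log-likelihood ratio also contains the log-ratio of the two normalizing constants, which is the quantity the paper estimates by thermodynamic integration. All function names and parameter values are illustrative.

```python
def gaussian_llr(x, mu0, mu1, sigma):
    """Log-likelihood ratio log(q(x) / p(x)) for p = N(mu0, sigma^2)
    and q = N(mu1, sigma^2)."""
    return (mu1 - mu0) / sigma**2 * (x - 0.5 * (mu0 + mu1))


def cusum_alarm(samples, llr, threshold):
    """Run W_t = max(0, W_{t-1} + llr(x_t)) and return the first index t
    at which W_t reaches the threshold, or None if it never does."""
    stat = 0.0
    for t, x in enumerate(samples):
        stat = max(0.0, stat + llr(x))
        if stat >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Noise-free toy sequence: the mean shifts from 0 to 1 at index 5.
    samples = [0.0] * 5 + [1.0] * 5
    llr = lambda x: gaussian_llr(x, 0.0, 1.0, 1.0)
    print(cusum_alarm(samples, llr, 1.5))
```

Raising the threshold lowers the false-alarm rate at the cost of a longer detection delay.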




Abstract:The effect of relative entropy asymmetry is analyzed in the context of empirical risk minimization (ERM) with relative entropy regularization (ERM-RER). Two regularizations are considered: $(a)$ the relative entropy of the measure to be optimized with respect to a reference measure (Type-I ERM-RER); or $(b)$ the relative entropy of the reference measure with respect to the measure to be optimized (Type-II ERM-RER). The main result is the characterization of the solution to the Type-II ERM-RER problem and its key properties. By comparing the well-understood Type-I ERM-RER with Type-II ERM-RER, the effects of entropy asymmetry are highlighted. The analysis shows that in both cases, regularization by relative entropy forces the solution's support to collapse into the support of the reference measure, introducing a strong inductive bias that can overshadow the evidence provided by the training data. Finally, it is shown that Type-II regularization is equivalent to Type-I regularization with an appropriate transformation of the empirical risk function.
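The asymmetry at the heart of this comparison can be seen numerically on a finite set. The sketch below computes the relative entropy in both directions and, for the Type-I problem only, evaluates the well-known Gibbs-form solution proportional to Q(θ) exp(-L(θ)/λ). The probability vectors, the risk values, and the regularization factor λ are hypothetical, and the Type-II solution characterized in the paper is not reproduced here.

```python
import math


def kl(p, q):
    """Relative entropy D(p || q) for finite distributions, in nats.
    Returns infinity when p puts mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def type1_gibbs(q, risk, lam):
    """Gibbs-form minimizer of E_p[risk] + lam * D(p || q) over a finite set."""
    weights = [qi * math.exp(-r / lam) for qi, r in zip(q, risk)]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    p = [0.5, 0.5, 0.0]
    q = [0.25, 0.25, 0.5]
    print(kl(p, q), kl(q, p))  # the two directions differ

    # The reference measure has no mass on the third model, so neither does the solution.
    print(type1_gibbs([0.5, 0.5, 0.0], [0.0, 1.0, 0.0], 1.0))
```

The second example illustrates the support collapse noted above for Type-I: a model with zero reference mass receives zero mass in the solution, regardless of its empirical risk.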