Digital twin (DT) platforms are increasingly regarded as a promising technology for controlling, optimizing, and monitoring complex engineering systems such as next-generation wireless networks. An important challenge in adopting DT solutions is their reliance on data collected offline, lacking direct access to the physical environment. This limitation is particularly severe in multi-agent systems, for which conventional multi-agent reinforcement learning (MARL) requires online interactions with the environment. A direct application of online MARL schemes to an offline setting would generally fail due to the epistemic uncertainty entailed by the limited availability of data. In this work, we propose an offline MARL scheme for DT-based wireless networks that integrates distributional RL and conservative Q-learning to address the environment's inherent aleatoric uncertainty and the epistemic uncertainty arising from limited data. To further exploit the offline data, we adapt the proposed scheme to the centralized training with decentralized execution framework, allowing joint training of the agents' policies. The proposed MARL scheme, referred to as multi-agent conservative quantile regression (MA-CQR), addresses general risk-sensitive design criteria and is applied to the trajectory planning problem in drone networks, showcasing its advantages.
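To make the combination of distributional RL and conservative Q-learning concrete, the following is a minimal single-agent sketch, assuming a PyTorch quantile Q-network; all names (q_net, target_net, alpha, kappa) and the specific loss composition are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch (not the authors' code) of a conservative quantile-regression loss
# for one agent, assuming a Q-network that outputs N quantiles per action.
import torch

def ma_cqr_loss(q_net, target_net, batch, taus, alpha=1.0, kappa=1.0, gamma=0.99):
    s, a, r, s_next, done = batch                       # offline transitions
    q_all = q_net(s)                                    # [B, num_actions, N] quantiles
    q_taken = q_all.gather(
        1, a.view(-1, 1, 1).expand(-1, 1, q_all.size(-1))).squeeze(1)   # [B, N]

    with torch.no_grad():
        q_next = target_net(s_next)                     # [B, num_actions, N]
        a_star = q_next.mean(dim=-1).argmax(dim=1)      # greedy action on mean of quantiles
        q_next_star = q_next[torch.arange(len(a_star)), a_star]          # [B, N]
        target = r.unsqueeze(-1) + gamma * (1 - done.unsqueeze(-1)) * q_next_star

    # Distributional (quantile Huber) temporal-difference loss
    td = target.unsqueeze(1) - q_taken.unsqueeze(2)     # [B, N, N]
    huber = torch.where(td.abs() <= kappa, 0.5 * td ** 2, kappa * (td.abs() - 0.5 * kappa))
    qr_loss = (torch.abs(taus.view(1, -1, 1) - (td.detach() < 0).float()) * huber).mean()

    # Conservative (CQL-style) penalty on the mean Q-values: pushes down values of
    # actions not observed in the offline dataset and up those of observed actions.
    q_mean = q_all.mean(dim=-1)                         # [B, num_actions]
    cql_penalty = (torch.logsumexp(q_mean, dim=1)
                   - q_mean.gather(1, a.view(-1, 1)).squeeze(1)).mean()

    return qr_loss + alpha * cql_penalty
```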
In a manner analogous to their classical counterparts, quantum classifiers are vulnerable to adversarial attacks that perturb their inputs. A promising countermeasure is to train the quantum classifier by adopting an attack-aware, or adversarial, loss function. This paper studies the generalization properties of quantum classifiers that are adversarially trained against bounded-norm white-box attacks. Specifically, a quantum adversary maximizes the classifier's loss by transforming an input state $\rho(x)$ into a state $\lambda$ that is $\epsilon$-close to the original state $\rho(x)$ in $p$-Schatten distance. Under suitable assumptions on the quantum embedding $\rho(x)$, we derive novel information-theoretic upper bounds on the generalization error of adversarially trained quantum classifiers for $p = 1$ and $p = \infty$. The derived upper bounds consist of two terms: the first is an exponential function of the 2-R\'enyi mutual information between classical data and quantum embedding, while the second term scales linearly with the adversarial perturbation size $\epsilon$. Both terms are shown to decrease as $1/\sqrt{T}$ with the training set size $T$. An extension is also considered in which the adversary assumed during training has parameters $p$ and $\epsilon$ that differ from those of the adversary affecting the test inputs. Finally, we validate our theoretical findings with numerical experiments for a synthetic setting.
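Purely as a schematic rendering of the structure just described, and not the formal result, the bound can be pictured as
\[
\mathbb{E}\big[L_{\mathrm{adv}}(\hat{\theta}) - \hat{L}_{\mathrm{adv}}(\hat{\theta})\big] \;\leq\; \frac{c_1\, e^{I_2(X;\rho(X))}}{\sqrt{T}} \;+\; \frac{c_2\,\epsilon}{\sqrt{T}},
\]
where $I_2(X;\rho(X))$ denotes the 2-R\'enyi mutual information between the classical input and its quantum embedding, $\hat{\theta}$ the adversarially trained classifier, and $c_1$, $c_2$ unspecified placeholder factors; the exact statement and its assumptions are given in the paper.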
Conformal risk control (CRC) is a recently proposed technique that applies post-hoc to a conventional point predictor to provide calibration guarantees. Generalizing conformal prediction (CP), with CRC, calibration is ensured for a set predictor that is extracted from the point predictor to control a risk function such as the probability of miscoverage or the false negative rate. The original CRC requires the available data set to be split between training and validation data sets. This can be problematic when data availability is limited, resulting in inefficient set predictors. In this paper, a novel CRC method is introduced that is based on cross-validation, rather than on validation as in the original CRC. The proposed cross-validation CRC (CV-CRC) extends a version of the jackknife-minmax from CP to CRC, allowing for the control of a broader range of risk functions. CV-CRC is proved to offer theoretical guarantees on the average risk of the set predictor. Furthermore, numerical experiments show that CV-CRC can reduce the average set size with respect to CRC when the available data are limited.
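The following Python sketch illustrates the cross-validation calibration step in a simplified form; the finite-sample correction delta is a placeholder, and the exact inflation needed for the stated guarantee (in the spirit of jackknife-minmax) is derived in the paper. All names are illustrative.

```python
# Minimal sketch (not the paper's exact procedure) of calibrating a threshold
# lambda by K-fold cross-validation so that the cross-validated risk estimate
# stays below the target level alpha.
import numpy as np

def cv_crc_calibrate(fold_models, held_out_folds, set_risk, lambdas, alpha, delta):
    """fold_models[k]: model trained with fold k held out.
    held_out_folds[k]: list of (x, y) pairs in fold k.
    set_risk(model, x, y, lam): risk (e.g., miscoverage or false-negative rate)
    of the set predictor with threshold lam on example (x, y)."""
    for lam in sorted(lambdas):            # assume the risk is non-increasing in lam
        losses = [set_risk(m, x, y, lam)
                  for m, fold in zip(fold_models, held_out_folds)
                  for (x, y) in fold]
        # delta is a placeholder finite-sample correction; the paper gives the exact term
        if np.mean(losses) + delta <= alpha:
            return lam                     # smallest threshold meeting the inflated target
    return max(lambdas)                    # fall back to the most conservative threshold
```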
The safe integration of machine learning modules in decision-making processes hinges on their ability to quantify uncertainty. A popular technique to achieve this goal is conformal prediction (CP), which transforms an arbitrary base predictor into a set predictor with coverage guarantees. While CP certifies the predicted set to contain the target quantity with a user-defined tolerance, it does not provide control over the average size of the predicted sets, i.e., over the informativeness of the prediction. In this work, a theoretical connection is established between the generalization properties of the base predictor and the informativeness of the resulting CP prediction sets. To this end, an upper bound is derived on the expected size of the CP set predictor that builds on generalization error bounds for the base predictor. The derived upper bound provides insights into the dependence of the average size of the CP set predictor on the amount of calibration data, the target reliability, and the generalization performance of the base predictor. The theoretical insights are validated using simple numerical regression and classification tasks.
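As a small, self-contained illustration of the quantities involved (synthetic, and not taken from the paper's analysis), the snippet below computes the average size of a split conformal prediction interval for a toy regression problem as a function of the amount of calibration data and the target miscoverage level alpha.

```python
# Toy illustration: average width of a split-CP interval when the base predictor's
# residuals are pure Gaussian noise. Shows the dependence on calibration size and alpha.
import numpy as np

rng = np.random.default_rng(0)

def cp_interval_width(n_cal, alpha, noise=1.0):
    cal_scores = np.abs(rng.normal(0.0, noise, n_cal))             # residual scores |y - f(x)|
    level = min(1.0, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)   # finite-sample-corrected quantile level
    return 2 * np.quantile(cal_scores, level)                      # width of the symmetric interval

for n_cal in (20, 100, 1000):
    print(n_cal, round(cp_interval_width(n_cal, alpha=0.1), 3))
```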
This paper presents a novel approach to enhance the communication efficiency of federated learning (FL) in multiple-input multiple-output (MIMO) wireless systems. The proposed method centers on a low-rank matrix factorization strategy for local gradient compression based on alternating least squares, along with over-the-air computation and error feedback. The proposed protocol, termed over-the-air low-rank compression (Ota-LC), is demonstrated to have lower computation cost and lower communication overhead as compared to existing benchmarks while guaranteeing the same inference performance. As an example, when targeting a test accuracy of 80% on the CIFAR-10 dataset, Ota-LC achieves a reduction in total communication costs of at least 30% when contrasted with benchmark schemes, while also reducing the computational complexity order by a factor equal to the sum of the dimensions of the gradients.
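A minimal client-side sketch of the main ingredients named above, namely low-rank factorization of the local gradient by a few alternating-least-squares passes plus an error-feedback memory, is given below; function names and the rank and iteration parameters are illustrative assumptions, and over-the-air aggregation is abstracted away.

```python
# Illustrative sketch (not the Ota-LC implementation) of client-side low-rank
# gradient compression with error feedback.
import numpy as np

def als_low_rank(G, r, n_iters=3, reg=1e-6):
    """Approximate the gradient matrix G (m x n) as U @ V.T with rank r."""
    m, n = G.shape
    V = np.random.default_rng(0).standard_normal((n, r))
    for _ in range(n_iters):
        U = G @ V @ np.linalg.inv(V.T @ V + reg * np.eye(r))      # least-squares update of U
        V = G.T @ U @ np.linalg.inv(U.T @ U + reg * np.eye(r))    # least-squares update of V
    return U, V

def client_round(grad, memory, r):
    target = grad + memory                 # error feedback: add the residual from the last round
    U, V = als_low_rank(target, r)
    memory = target - U @ V.T              # store what the low-rank factors missed
    return U, V, memory                    # U, V would be transmitted over the air
```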
Spiking neural networks (SNNs) implemented on neuromorphic processors (NPs) can enhance the energy efficiency of deployments of artificial intelligence (AI) for specific workloads. As such, NPs represent an interesting opportunity for implementing AI tasks on board power-limited satellite communication spacecraft. In this article, we disseminate the findings of a recently completed study that compared the performance and power consumption of different satellite communication use cases implemented on standard AI accelerators and on NPs. In particular, the article describes three prominent use cases, namely payload resource optimization, onboard interference detection and classification, and dynamic receive beamforming; and compares the performance of conventional convolutional neural networks (CNNs) implemented on Xilinx's VCK5000 Versal development card and SNNs on Intel's neuromorphic chip Loihi 2.
Embodying the principle of simulation intelligence, digital twin (DT) systems construct and maintain a high-fidelity virtual model of a physical system. This paper focuses on ray tracing (RT), which is widely seen as an enabling technology for DTs of the radio access network (RAN) segment of next-generation disaggregated wireless systems. RT makes it possible to simulate channel conditions, enabling data augmentation and prediction-based transmission. However, the effectiveness of RT hinges on the adaptation of the electromagnetic properties assumed by the RT to actual channel conditions, a process known as calibration. The main challenge of RT calibration is the fact that small discrepancies in the geometric model fed to the RT software hinder the accuracy of the predicted phases of the simulated propagation paths. Existing solutions to this problem either rely on the channel power profile, hence disregarding phase information, or they operate on the channel responses by assuming the simulated phases to be sufficiently accurate for calibration. This paper proposes a novel channel response-based scheme that, unlike the state of the art, estimates and compensates for the phase errors in the RT-generated channel responses. The proposed approach builds on the variational expectation maximization algorithm with a flexible choice of the prior phase-error distribution that bridges between a deterministic model with no phase errors and a stochastic model with uniform phase errors. The algorithm is computationally efficient, and is demonstrated, by leveraging the open-source differentiable RT software available within the Sionna library, to outperform existing methods in terms of the accuracy of RT predictions.
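As a highly simplified illustration of the compensation step only, and not of the variational expectation maximization procedure with a tunable phase-error prior described above, the sketch below estimates one phase correction per RT-predicted path by coordinate ascent on a least-squares fit to a measured channel response; all names and array shapes are assumptions.

```python
# Simplified illustration of per-path phase-error compensation for RT calibration.
import numpy as np

def estimate_phase_errors(H_paths, y, n_sweeps=5):
    """H_paths: (num_paths, num_freq) RT-predicted per-path frequency responses.
    y: (num_freq,) measured channel response."""
    P, _ = H_paths.shape
    phi = np.zeros(P)                                  # phase correction per path
    for _ in range(n_sweeps):
        for p in range(P):
            others = sum(np.exp(1j * phi[q]) * H_paths[q] for q in range(P) if q != p)
            residual = y - others
            # best phase rotates the p-th path contribution toward the residual
            phi[p] = np.angle(np.vdot(H_paths[p], residual))
    # calibrated prediction: sum of phase-corrected path contributions
    return phi, sum(np.exp(1j * phi[p]) * H_paths[p] for p in range(P))
```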
Consider an active learning setting in which a learner has a training set with few labeled examples and a pool set with many unlabeled inputs, while a remote teacher has a pre-trained model that is known to perform well for the learner's task. The learner actively transmits batches of unlabeled inputs to the teacher through a constrained communication channel for labeling. This paper addresses the following key questions: (i) Active batch selection: Which batch of inputs should be sent to the teacher to acquire the most useful information and thus reduce the number of required communication rounds? (ii) Batch encoding: How do we encode the batch of inputs for transmission to the teacher to reduce the communication resources required at each round? We introduce Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a novel protocol that integrates Bayesian active learning with compression via a linear mix-up mechanism. Bayesian active learning selects the batch of inputs based on their epistemic uncertainty, addressing the "confirmation bias" that is known to increase the number of required communication rounds. Furthermore, the proposed mix-up compression strategy is integrated with the epistemic uncertainty-based active batch selection process to reduce the communication overhead at each round.
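A minimal sketch of the two client-side steps, epistemic-uncertainty-based batch selection (here via an ensemble-based BALD-style score) and linear mix-up compression, is given below; the function names and the use of Dirichlet mixing weights are illustrative assumptions rather than the CC-BAKD specification.

```python
# Illustrative sketch of selecting the most epistemically uncertain pool inputs and
# compressing the selected batch via a linear mix-up before transmission.
import numpy as np

def select_and_mixup(pool_X, ensemble_probs, batch_size, n_mix):
    """pool_X: (N, d) unlabeled inputs.
    ensemble_probs: (M, N, C) class probabilities from M ensemble members."""
    mean_p = ensemble_probs.mean(axis=0)                               # (N, C)
    total = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)             # predictive entropy
    expected = -(ensemble_probs * np.log(ensemble_probs + 1e-12)).sum(axis=2).mean(axis=0)
    epistemic = total - expected                                       # BALD-style mutual information
    idx = np.argsort(-epistemic)[:batch_size]                          # most uncertain inputs
    batch = pool_X[idx]                                                # (batch_size, d)
    rng = np.random.default_rng(0)
    W = rng.dirichlet(np.ones(batch_size), size=n_mix)                 # (n_mix, batch_size) mixing weights
    return idx, W @ batch                                              # n_mix mixed inputs to transmit
```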
Large pre-trained sequence models, such as transformer-based architectures, have been recently shown to have the capacity to carry out in-context learning (ICL). In ICL, a decision on a new input is made via a direct mapping of the input and of a few examples from the given task, serving as the task's context, to the output variable. No explicit updates of model parameters are needed to tailor the decision to a new task. Pre-training, which amounts to a form of meta-learning, is based on the observation of examples from several related tasks. Prior work has shown ICL capabilities for linear regression. In this study, we leverage ICL to address the inverse problem of multiple-input and multiple-output (MIMO) equalization based on a context given by pilot symbols. A task is defined by the unknown fading channel and by the signal-to-noise ratio (SNR) level, which may be known. To highlight the practical potential of the approach, we allow for quantization of the received signals. We demonstrate via numerical results that transformer-based ICL has a threshold behavior, whereby, as the number of pre-training tasks grows, the performance switches from that of a minimum mean squared error (MMSE) equalizer with a prior determined by the pre-training tasks to that of an MMSE equalizer with the true data-generating prior.
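The sketch below shows one way such an in-context equalizer could be set up, with pilot (received, transmitted) pairs as context tokens followed by the new received vector; the architecture and tokenization details are assumptions for illustration, not the paper's model.

```python
# Illustrative sketch of an in-context MIMO equalizer: pilot pairs form the context,
# and the output at the final position estimates the new transmitted symbol vector.
# Complex quantities are handled by stacking real and imaginary parts.
import torch
import torch.nn as nn

class ICLEqualizer(nn.Module):
    def __init__(self, n_tx, n_rx, d_model=64, n_layers=4, n_heads=4):
        super().__init__()
        token_dim = 2 * (n_rx + n_tx)            # real/imag of received + transmitted slots
        self.embed = nn.Linear(token_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 2 * n_tx)

    def forward(self, pilots_y, pilots_x, y_new):
        # pilots_y: (B, K, 2*n_rx), pilots_x: (B, K, 2*n_tx), y_new: (B, 2*n_rx)
        ctx = torch.cat([pilots_y, pilots_x], dim=-1)                  # K pilot tokens
        query = torch.cat([y_new, torch.zeros_like(pilots_x[:, 0])], dim=-1).unsqueeze(1)
        h = self.encoder(self.embed(torch.cat([ctx, query], dim=1)))
        return self.head(h[:, -1])               # estimate of the new transmitted symbols
```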
Addressing the communication bottleneck inherent in federated learning (FL), over-the-air FL (AirFL) has emerged as a promising solution, which is, however, hampered by deep fading conditions. In this paper, we propose AirFL-Mem, a novel scheme designed to mitigate the impact of deep fading by implementing a \emph{long-term} memory mechanism. Convergence bounds are provided that account for long-term memory, as well as for existing AirFL variants with short-term memory, for general non-convex objectives. The theory demonstrates that AirFL-Mem exhibits the same convergence rate as federated averaging (FedAvg) with ideal communication, while the performance of existing schemes is generally limited by error floors. The theoretical results are also leveraged to propose a novel convex optimization strategy for the truncation threshold used for power control in the presence of Rayleigh fading channels. Experimental results validate the analysis, confirming the advantages of a long-term memory mechanism for the mitigation of deep fading.
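The long-term memory mechanism can be sketched per device as follows; the truncation-based power control and the over-the-air aggregation at the server are abstracted away, and all names are illustrative assumptions.

```python
# Illustrative sketch (not the AirFL-Mem implementation): when a device's fading gain
# falls below the truncation threshold, it does not transmit, and the untransmitted
# update is accumulated in a local memory that is added to the next round's update.
import numpy as np

def device_transmit(update, memory, channel_gain, threshold):
    signal = update + memory                       # include everything missed so far
    if channel_gain >= threshold:
        # transmit (channel-inverting power control above the truncation threshold)
        return signal, np.zeros_like(memory)       # memory is flushed after transmission
    else:
        return None, signal                        # deep fade: keep the update for later rounds
```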