This paper studies the computational offloading of video action recognition in edge computing. To achieve effective semantic information extraction and compression, following semantic communication we propose a novel spatiotemporal attention-based autoencoder (STAE) architecture, including a frame attention module and a spatial attention module, to evaluate the importance of frames and pixels in each frame. Additionally, we use entropy encoding to remove statistical redundancy in the compressed data to further reduce communication overhead. At the receiver, we develop a lightweight decoder that leverages a 3D-2D CNN combined architecture to reconstruct missing information by simultaneously learning temporal and spatial information from the received data to improve accuracy. To fasten convergence, we use a step-by-step approach to train the resulting STAE-based vision transformer (ViT_STAE) models. Experimental results show that ViT_STAE can compress the video dataset HMDB51 by 104x with only 5% accuracy loss, outperforming the state-of-the-art baseline DeepISC. The proposed ViT_STAE achieves faster inference and higher accuracy than the DeepISC-based ViT model under time-varying wireless channel, which highlights the effectiveness of STAE in guaranteeing higher accuracy under time constraints.
Radio Frequency (RF) signal-based multimodal image inpainting has recently emerged as a promising paradigm to enhance the capability of distortion-free image restoration by integrating wireless and visual information from the identical physical environment and has potential applications in fields like security and surveillance systems. In this paper, we aim to implement an RF-based image inpainting system that enables image restoration in a complex environment while maintaining high robustness and accuracy. This requires accurately converting RF signals into meaningful visual information and overcoming the challenges of RF signals in complex environments, such as multipath interference, signal attenuation, and noise. To tackle this problem, we propose Trans-Inpainter, a novel image inpainting method that utilizes the Channel State Information (CSI) of WiFi signals in combination with transformer networks to generate high-quality reconstructed images. This approach is the first to use CSI for image inpainting, which allows for extracting visual information from WiFi signals to fill in missing regions in images. To further improve Trans-Inpainter's performance, we investigate the impact of variations in CSI data on RF-based imaging ability, i.e., analyzing how the location of the CSI sensors, the combination of CSI from different sensors, and changes in temporal or frequency dimensions of CSI matrix affect the imaging quality. We compare the performance of Trans-Inpainter with RF-Inpainter, the state-of-the-art technology for RF-based multimodal image inpainting, under more realistic experimental scenarios, and with single-modality image inpainting models when only RF or image data is available, respectively. The results show that Trans-Inpainter outperforms other baseline methods in all cases.
Prolonging the lifetime of massive machine-type communication (MTC) networks is key to realizing a sustainable digitized society. Great energy savings can be achieved by accurately predicting MTC traffic followed by properly designed resource allocation mechanisms. However, selecting the proper MTC traffic predictor is not straightforward and depends on accuracy/complexity trade-offs and the specific MTC applications and network characteristics. Remarkably, the related state-of-the-art literature still lacks such debates. Herein, we assess the performance of several machine learning (ML) methods to predict Poisson and quasi-periodic MTC traffic in terms of accuracy and computational cost. Results show that the temporal convolutional network (TCN) outperforms the long-short term memory (LSTM), the gated recurrent units (GRU), and the recurrent neural network (RNN), in that order. For Poisson traffic, the accuracy gap between the predictors is larger than under quasi-periodic traffic. Finally, we show that running a TCN predictor is around three times more costly than other methods, while the training/inference time is the greatest/least.
Semantic communication (SemCom) aims to convey the meaning behind a transmitted message by transmitting only semantically-relevant information. This semantic-centric design helps to minimize power usage, bandwidth consumption, and transmission delay. SemCom and goal-oriented SemCom (or effectiveness-level SemCom) are therefore promising enablers of 6G and developing rapidly. Despite the surge in their swift development, the design, analysis, optimization, and realization of robust and intelligent SemCom as well as goal-oriented SemCom are fraught with many fundamental challenges. One of the challenges is that the lack of unified/universal metrics of SemCom and goal-oriented SemCom can stifle research progress on their respective algorithmic, theoretical, and implementation frontiers. Consequently, this survey paper documents the existing metrics -- scattered in many references -- of wireless SemCom, optical SemCom, quantum SemCom, and goal-oriented wireless SemCom. By doing so, this paper aims to inspire the design, analysis, and optimization of a wide variety of SemCom and goal-oriented SemCom systems. This article also stimulates the development of unified/universal performance assessment metrics of SemCom and goal-oriented SemCom, as the existing metrics are purely statistical and hardly applicable to reasoning-type tasks that constitute the heart of 6G and beyond.
Satellite-ground integrated digital twin networks (SGIDTNs) are regarded as innovative network architectures for reducing network congestion, enabling nearly-instant data mapping from the physical world to digital systems, and offering ubiquitous intelligence services to terrestrial users. However, the challenges, such as the pricing policy, the stochastic task arrivals, the time-varying satellite locations, mutual channel interference, and resource scheduling mechanisms between the users and cloud servers, are critical for improving quality of service in SGIDTNs. Hence, we establish a blockchain-aided Stackelberg game model for maximizing the pricing profits and network throughput in terms of minimizing overhead of privacy protection, thus performing computation offloading, decreasing channel interference, and improving privacy protection. Next, we propose a Lyapunov stability theory-based model-agnostic metalearning aided multi-agent deep federated reinforcement learning (MAML-MADFRL) framework for optimizing the CPU cycle frequency, channel selection, task-offloading decision, block size, and cloud server price, which facilitate the integration of communication, computation, and block resources. Subsequently, the extensive performance analyses show that the proposed MAMLMADFRL algorithm can strengthen the privacy protection via the transaction verification mechanism, approach the optimal time average penalty, and fulfill the long-term average queue size via lower computational complexity. Finally, our simulation results indicate that the proposed MAML-MADFRL learning framework is superior to the existing baseline methods in terms of network throughput, channel interference, cloud server profits, and privacy overhead.
Semantic communication (SemCom) has emerged as a 6G enabler while promising to minimize power usage, bandwidth consumption, and transmission delay by minimizing irrelevant information transmission. However, the benefits of such a semantic-centric design can be limited by radio frequency interference (RFI) that causes substantial semantic noise. The impact of semantic noise due to interference can be alleviated using an interference-resistant and robust (IR$^2$) SemCom design. Nevertheless, no such design exists yet. To shed light on this knowledge gap and stimulate fundamental research on IR$^2$ SemCom, the performance limits of a text SemCom system named DeepSC is studied in the presence of single- and multi-interferer RFI. By introducing a principled probabilistic framework for SemCom, we show that DeepSC produces semantically irrelevant sentences as the power of single- and multi-interferer RFI gets very large. Corroborated by Monte Carlo simulations, these performance limits offer design insights -- regarding IR$^2$ SemCom -- contrary to the theoretically unsubstantiated sentiment that SemCom techniques (such as DeepSC) work well in very low signal-to-noise ratio regimes. The performance limits also reveal the vulnerability of DeepSC and SemCom to a wireless attack using RFI. Furthermore, our introduced probabilistic framework inspires the performance analysis of many text SemCom techniques, as they are chiefly inspired by DeepSC.
Semantic communication is a novel communication paradigm that focuses on recognizing and delivering the desired meaning of messages to the destination users. Most existing works in this area focus on delivering explicit semantics, labels or signal features that can be directly identified from the source signals. In this paper, we consider the implicit semantic communication problem in which hidden relations and closely related semantic terms that cannot be recognized from the source signals need to also be delivered to the destination user. We develop a novel adversarial learning-based implicit semantic-aware communication (iSAC) architecture in which the source user, instead of maximizing the total amount of information transmitted to the channel, aims to help the recipient learn an inference rule that can automatically generate implicit semantics based on limited clue information. We prove that by applying iSAC, the destination user can always learn an inference rule that matches the true inference rule of the source messages. Experimental results show that the proposed iSAC can offer up to a 19.69 dB improvement over existing non-inferential communication solutions, in terms of symbol error rate at the destination user.
Metaverse over wireless networks is an emerging use case of the sixth generation (6G) wireless systems, posing unprecedented challenges in terms of its multi-modal data transmissions with stringent latency and reliability requirements. Towards enabling this wireless metaverse, in this article we propose a novel semantic communication (SC) framework by decomposing the metaverse into human/machine agent-specific semantic multiverses (SMs). An SM stored at each agent comprises a semantic encoder and a generator, leveraging recent advances in generative artificial intelligence (AI). To improve communication efficiency, the encoder learns the semantic representations (SRs) of multi-modal data, while the generator learns how to manipulate them for locally rendering scenes and interactions in the metaverse. Since these learned SMs are biased towards local environments, their success hinges on synchronizing heterogeneous SMs in the background while communicating SRs in the foreground, turning the wireless metaverse problem into the problem of semantic multiverse communication (SMC). Based on this SMC architecture, we propose several promising algorithmic and analytic tools for modeling and designing SMC, ranging from distributed learning and multi-agent reinforcement learning (MARL) to signaling games and symbolic AI.
While witnessing the noisy intermediate-scale quantum (NISQ) era and beyond, quantum federated learning (QFL) has recently become an emerging field of study. In QFL, each quantum computer or device locally trains its quantum neural network (QNN) with trainable gates, and communicates only these gate parameters over classical channels, without costly quantum communications. Towards enabling QFL under various channel conditions, in this article we develop a depth-controllable architecture of entangled slimmable quantum neural networks (eSQNNs), and propose an entangled slimmable QFL (eSQFL) that communicates the superposition-coded parameters of eS-QNNs. Compared to the existing depth-fixed QNNs, training the depth-controllable eSQNN architecture is more challenging due to high entanglement entropy and inter-depth interference, which are mitigated by introducing entanglement controlled universal (CU) gates and an inplace fidelity distillation (IPFD) regularizer penalizing inter-depth quantum state differences, respectively. Furthermore, we optimize the superposition coding power allocation by deriving and minimizing the convergence bound of eSQFL. In an image classification task, extensive simulations corroborate the effectiveness of eSQFL in terms of prediction accuracy, fidelity, and entropy compared to Vanilla QFL as well as under different channel conditions and various data distributions.
Recent advances in Federated Learning (FL) have paved the way towards the design of novel strategies for solving multiple learning tasks simultaneously, by leveraging cooperation among networked devices. Multi-Task Learning (MTL) exploits relevant commonalities across tasks to improve efficiency compared with traditional transfer learning approaches. By learning multiple tasks jointly, significant reduction in terms of energy footprints can be obtained. This article provides a first look into the energy costs of MTL processes driven by the Model-Agnostic Meta-Learning (MAML) paradigm and implemented in distributed wireless networks. The paper targets a clustered multi-task network setup where autonomous agents learn different but related tasks. The MTL process is carried out in two stages: the optimization of a meta-model that can be quickly adapted to learn new tasks, and a task-specific model adaptation stage where the learned meta-model is transferred to agents and tailored for a specific task. This work analyzes the main factors that influence the MTL energy balance by considering a multi-task Reinforcement Learning (RL) setup in a robotized environment. Results show that the MAML method can reduce the energy bill by at least 2 times compared with traditional approaches without inductive transfer. Moreover, it is shown that the optimal energy balance in wireless networks depends on uplink/downlink and sidelink communication efficiencies.