Abstract:Mixture of Experts (MoE) has emerged as a promising paradigm for scaling model capacity while preserving computational efficiency, particularly in large-scale machine learning architectures such as large language models (LLMs). Recent advances in MoE have facilitated its adoption in wireless networks to address the increasing complexity and heterogeneity of modern communication systems. This paper presents a comprehensive survey of the MoE framework in wireless networks, highlighting its potential in optimizing resource efficiency, improving scalability, and enhancing adaptability across diverse network tasks. We first introduce the fundamental concepts of MoE, including various gating mechanisms and the integration with generative AI (GenAI) and reinforcement learning (RL). Subsequently, we discuss the extensive applications of MoE across critical wireless communication scenarios, such as vehicular networks, unmanned aerial vehicles (UAVs), satellite communications, heterogeneous networks, integrated sensing and communication (ISAC), and mobile edge networks. Furthermore, key applications in channel prediction, physical layer signal processing, radio resource management, network optimization, and security are thoroughly examined. Additionally, we present a detailed overview of open-source datasets that are widely used in MoE-based models to support diverse machine learning tasks. Finally, this survey identifies crucial future research directions for MoE, emphasizing the importance of advanced training techniques, resource-aware gating strategies, and deeper integration with emerging 6G technologies.
Abstract:In this paper, we propose a novel physics-informed generative learning approach, termed RadioDiff-$\bm{k^2}$, for accurate and efficient multipath-aware radio map (RM) construction. As wireless communication evolves towards environment-aware paradigms, driven by the increasing demand for intelligent and proactive optimization in sixth-generation (6G) networks, accurate construction of RMs becomes crucial yet highly challenging. Conventional electromagnetic (EM)-based methods, such as full-wave solvers and ray-tracing approaches, exhibit substantial computational overhead and limited adaptability to dynamic scenarios. Although, existing neural network (NN) approaches have efficient inferencing speed, they lack sufficient consideration of the underlying physics of EM wave propagation, limiting their effectiveness in accurately modeling critical EM singularities induced by complex multipath environments. To address these fundamental limitations, we propose a novel physics-inspired RM construction method guided explicitly by the Helmholtz equation, which inherently governs EM wave propagation. Specifically, we theoretically establish a direct correspondence between EM singularities, which correspond to the critical spatial features influencing wireless propagation, and regions defined by negative wave numbers in the Helmholtz equation. Based on this insight, we design an innovative dual generative diffusion model (DM) framework comprising one DM dedicated to accurately inferring EM singularities and another DM responsible for reconstructing the complete RM using these singularities along with environmental contextual information. Our physics-informed approach uniquely combines the efficiency advantages of data-driven methods with rigorous physics-based EM modeling, significantly enhancing RM accuracy, particularly in complex propagation environments dominated by multipath effects.
Abstract:Channel prediction is crucial for high-mobility vehicular networks, as it enables the anticipation of future channel conditions and the proactive adjustment of communication strategies. However, achieving accurate vehicular channel prediction is challenging due to significant Doppler effects and rapid channel variations resulting from high-speed vehicle movement and complex propagation environments. In this paper, we propose a novel delay-Doppler (DD) domain channel prediction framework tailored for high-mobility vehicular networks. By transforming the channel representation into the DD domain, we obtain an intuitive, sparse, and stable depiction that closely aligns with the underlying physical propagation processes, effectively reducing the complex vehicular channel to a set of time-series parameters with enhanced predictability. Furthermore, we leverage the large artificial intelligence (AI) model to predict these DD-domain time-series parameters, capitalizing on their advanced ability to model temporal correlations. The zero-shot capability of the pre-trained large AI model facilitates accurate channel predictions without requiring task-specific training, while subsequent fine-tuning on specific vehicular channel data further improves prediction accuracy. Extensive simulation results demonstrate the effectiveness of our DD-domain channel prediction framework and the superior accuracy of the large AI model in predicting time-series channel parameters, thereby highlighting the potential of our approach for robust vehicular communication systems.
Abstract:Large Language Model (LLM) at mobile devices and its potential applications never fail to fascinate. However, on-device LLM fine-tuning poses great challenges due to extremely high memory requirements and slow training speeds. Even with parameter-efficient fine-tuning (PEFT) methods that update only a small subset of parameters, resource-constrained mobile devices cannot afford them. In this paper, we propose MobiLLM to enable memory-efficient transformer LLM fine-tuning on a mobile device via server-assisted side-tuning. Particularly, MobiLLM allows the resource-constrained mobile device to retain merely a frozen backbone model, while offloading the memory and computation-intensive backpropagation of a trainable side-network to a high-performance server. Unlike existing fine-tuning methods that keep trainable parameters inside the frozen backbone, MobiLLM separates a set of parallel adapters from the backbone to create a backpropagation bypass, involving only one-way activation transfers from the mobile device to the server with low-width quantization during forward propagation. In this way, the data never leaves the mobile device while the device can remove backpropagation through the local backbone model and its forward propagation can be paralyzed with the server-side execution. Thus, MobiLLM preserves data privacy while significantly reducing the memory and computational burdens for LLM fine-tuning. Through extensive experiments, we demonstrate that MobiLLM can enable a resource-constrained mobile device, even a CPU-only one, to fine-tune LLMs and significantly reduce convergence time and memory usage.
Abstract:Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI). To achieve high reliability and scalability in deploying these agent services, it is essential to collaboratively optimize computational and communication resources, thereby ensuring effective resource allocation and seamless service delivery. In pursuit of this vision, this paper proposes a unified framework aimed at providing a comprehensive survey on deploying FM-based agent services across heterogeneous devices, with the emphasis on the integration of model and resource optimization to establish a robust infrastructure for these services. Particularly, this paper begins with exploring various low-level optimization strategies during inference and studies approaches that enhance system scalability, such as parallelism techniques and resource scaling methods. The paper then discusses several prominent FMs and investigates research efforts focused on inference acceleration, including techniques such as model compression and token reduction. Moreover, the paper also investigates critical components for constructing agent services and highlights notable intelligent applications. Finally, the paper presents potential research directions for developing real-time agent services with high Quality of Service (QoS).
Abstract:In this article, we present a novel user-centric service provision for immersive communications (IC) in 6G to deal with the uncertainty of individual user behaviors while satisfying unique requirements on the quality of multi-sensory experience. To this end, we propose a data-oriented approach for network resource management, featuring personalized data management that can support network modeling tailored to different user demands. Our approach leverages the digital twin (DT) technique as a key enabler. Particularly, a DT is established for each user, and the data attributes in the DT are customized based on the characteristics of the user. The DT functions, corresponding to various data operations, are customized in the development, evaluation, and update of network models to meet unique user demands. A trace-driven case study demonstrates the effectiveness of our approach in achieving user-centric IC and the significance of personalized data management in 6G.
Abstract:Integrated sensing and communications (ISAC) is expected to be a key technology for 6G, and channel state information (CSI) based sensing is a key component of ISAC. However, current research on ISAC focuses mainly on improving sensing performance, overlooking security issues, particularly the unauthorized sensing of users. In this paper, we propose a secure sensing system (DFSS) based on two distinct diffusion models. Specifically, we first propose a discrete conditional diffusion model to generate graphs with nodes and edges, guiding the ISAC system to appropriately activate wireless links and nodes, which ensures the sensing performance while minimizing the operation cost. Using the activated links and nodes, DFSS then employs the continuous conditional diffusion model to generate safeguarding signals, which are next modulated onto the pilot at the transmitter to mask fluctuations caused by user activities. As such, only ISAC devices authorized with the safeguarding signals can extract the true CSI for sensing, while unauthorized devices are unable to achieve the same sensing. Experiment results demonstrate that DFSS can reduce the activity recognition accuracy of the unauthorized devices by approximately 70%, effectively shield the user from the unauthorized surveillance.
Abstract:Radio map (RM) is a promising technology that can obtain pathloss based on only location, which is significant for 6G network applications to reduce the communication costs for pathloss estimation. However, the construction of RM in traditional is either computationally intensive or depends on costly sampling-based pathloss measurements. Although the neural network (NN)-based method can efficiently construct the RM without sampling, its performance is still suboptimal. This is primarily due to the misalignment between the generative characteristics of the RM construction problem and the discrimination modeling exploited by existing NN-based methods. Thus, to enhance RM construction performance, in this paper, the sampling-free RM construction is modeled as a conditional generative problem, where a denoised diffusion-based method, named RadioDiff, is proposed to achieve high-quality RM construction. In addition, to enhance the diffusion model's capability of extracting features from dynamic environments, an attention U-Net with an adaptive fast Fourier transform module is employed as the backbone network to improve the dynamic environmental features extracting capability. Meanwhile, the decoupled diffusion model is utilized to further enhance the construction performance of RMs. Moreover, a comprehensive theoretical analysis of why the RM construction is a generative problem is provided for the first time, from both perspectives of data features and NN training methods. Experimental results show that the proposed RadioDiff achieves state-of-the-art performance in all three metrics of accuracy, structural similarity, and peak signal-to-noise ratio. The code is available at https://github.com/UNIC-Lab/RadioDiff.
Abstract:This article presents a digital twin (DT)-enhanced reinforcement learning (RL) framework aimed at optimizing performance and reliability in network resource management, since the traditional RL methods face several unified challenges when applied to physical networks, including limited exploration efficiency, slow convergence, poor long-term performance, and safety concerns during the exploration phase. To deal with the above challenges, a comprehensive DT-based framework is proposed to enhance the convergence speed and performance for unified RL-based resource management. The proposed framework provides safe action exploration, more accurate estimates of long-term returns, faster training convergence, higher convergence performance, and real-time adaptation to varying network conditions. Then, two case studies on ultra-reliable and low-latency communication (URLLC) services and multiple unmanned aerial vehicles (UAV) network are presented, demonstrating improvements of the proposed framework in performance, convergence speed, and training cost reduction both on traditional RL and neural network based Deep RL (DRL). Finally, the article identifies and explores some of the research challenges and open issues in this rapidly evolving field.
Abstract:To maintain high perception performance among connected and autonomous vehicles (CAVs), in this paper, we propose an accuracy-aware and resource-efficient raw-level cooperative sensing and computing scheme among CAVs and road-side infrastructure. The scheme enables fined-grained partial raw sensing data selection, transmission, fusion, and processing in per-object granularity, by exploiting the parallelism among object classification subtasks associated with each object. A supervised learning model is trained to capture the relationship between the object classification accuracy and the data quality of selected object sensing data, facilitating accuracy-aware sensing data selection. We formulate an optimization problem for joint sensing data selection, subtask placement and resource allocation among multiple object classification subtasks, to minimize the total resource cost while satisfying the delay and accuracy requirements. A genetic algorithm based iterative solution is proposed for the optimization problem. Simulation results demonstrate the accuracy awareness and resource efficiency achieved by the proposed cooperative sensing and computing scheme, in comparison with benchmark solutions.