Abstract: Deep reinforcement learning (DRL) has long been a promising solution for sequential resource management in wireless networks. However, conventional DRL methods are fundamentally limited by their reliance on unimodal policy distributions, inefficient exploration in high-dimensional action spaces, and poor adaptability to dynamic and heterogeneous environments. Meanwhile, diffusion models (DMs), one of the most powerful families of generative AI, have demonstrated remarkable capabilities in modeling complex, multimodal data distributions across diverse domains. The integration of DMs and DRL has opened a new and rapidly growing research direction, in which DM-enabled policies substantially enhance decision quality by capturing the complex, discontinuous, and multimodal action structures inherent in wireless resource management. In this paper, we present a comprehensive survey of DM-enabled DRL algorithms and their applications to various problems in wireless networks. In particular, we first provide the theoretical background of DMs and present different DM-enabled DRL algorithms. We then systematically review applications of DM-enabled DRL across computation offloading in mobile edge computing, UAV-assisted, vehicular, and AIGC-driven systems, as well as wireless resource allocation, physical-layer security, and robotics and UAV planning. We conclude the paper by highlighting future research directions.
Abstract: Pinching-antenna systems (PASS) enable blockage mitigation in urban micro (UMi) networks through flexible antenna placement. However, the joint optimization of antenna positions and beamforming precoding is inherently nonconvex and becomes significantly more challenging under user mobility. To address this issue, we propose a bilevel optimization framework for dynamic antenna positioning and beamforming precoding design. In the outer level, a soft actor-critic (SAC) agent learns a continuous control policy for real-time antenna positioning, while in the inner level, zero-forcing (ZF) precoding is applied based on the instantaneous effective channel. Numerical results demonstrate that the proposed framework significantly improves spectral efficiency (SE) and enhances robustness against user mobility and random blockages.
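The inner level described above can be illustrated with a minimal sketch of ZF precoding and the resulting sum SE. This is not the authors' implementation: the function names `zf_precoder` and `sum_spectral_efficiency`, the total-power normalization, the noise power, and the random Rayleigh channel standing in for the PASS effective channel are all assumptions made for illustration. In a bilevel loop, the outer agent would choose antenna positions, the effective channel `H` would be recomputed from them, and a quantity such as the SE computed here could serve as feedback; the abstract does not specify the reward.

```python
import numpy as np


def zf_precoder(H, total_power=1.0):
    """Zero-forcing precoder for a (K users x N antennas) channel, K <= N.

    Assumes H has full row rank. Returns W with shape (N, K).
    """
    # Right pseudo-inverse of H: W = H^H (H H^H)^{-1}, so H @ W is diagonal.
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    # Scale so that the total transmit power ||W||_F^2 equals total_power
    # (an assumed power constraint, not taken from the source).
    W = W * (np.sqrt(total_power) / np.linalg.norm(W, "fro"))
    return W


def sum_spectral_efficiency(H, W, noise_power=1e-2):
    """Sum of log2(1 + SINR_k) over users, in bit/s/Hz."""
    G = np.abs(H @ W) ** 2            # G[k, j]: power at user k from stream j
    signal = np.diag(G)               # desired-stream power per user
    interference = G.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))


# Placeholder channel: i.i.d. Rayleigh fading, NOT the PASS channel model.
rng = np.random.default_rng(0)
K, N = 3, 4
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)

W = zf_precoder(H)
se = sum_spectral_efficiency(H, W)
print(f"sum SE: {se:.2f} bit/s/Hz")
```

Because ZF inverts the instantaneous channel in closed form, the inner level needs no iterative solver, which leaves only the antenna positions as the variables to be learned.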