Abstract:Immersive virtual reality (VR) applications impose stringent requirements on latency, energy efficiency, and computational resources, particularly in multi-user interactive scenarios. To address these challenges, we introduce the concept of spatial computing communications (SCC), a framework designed to meet the latency and energy demands of multi-user VR over distributed mobile edge computing (MEC) networks. SCC jointly represents the physical space, defined by users and base stations, and the virtual space, representing shared immersive environments, using a probabilistic model of user dynamics and resource requirements. The resource deployment task is then formulated as a multi-objective combinatorial optimization (MOCO) problem that simultaneously minimizes system latency and energy consumption across distributed MEC resources. To solve this problem, we propose MO-CMPO, a multi-objective consistency model with policy optimization that integrates supervised learning and reinforcement learning (RL) fine-tuning guided by preference weights. Leveraging a sparse graph neural network (GNN), MO-CMPO efficiently generates Pareto-optimal solutions. Simulations with real-world New Radio base station datasets demonstrate that MO-CMPO achieves superior hypervolume performance and significantly lower inference latency than baseline methods. Furthermore, the analysis reveals practical deployment patterns: latency-oriented solutions favor local MEC execution to reduce transmission delay, while energy-oriented solutions minimize redundant placements to save energy.
Abstract:Collaborative perception emphasizes enhancing environmental understanding by enabling multiple agents to share visual information with limited bandwidth resources. While prior work has explored the empirical trade-off between task performance and communication volume, a significant gap remains in the theoretical foundation. To fill this gap, we draw on information theory and introduce a pragmatic rate-distortion theory for multi-agent collaboration, specifically formulated to analyze performance-communication trade-off in goal-oriented multi-agent systems. This theory concretizes two key conditions for designing optimal communication strategies: supplying pragmatically relevant information and transmitting redundancy-less messages. Guided by these two conditions, we propose RDcomm, a communication-efficient collaborative perception framework that introduces two key innovations: i) task entropy discrete coding, which assigns features with task-relevant codeword-lengths to maximize the efficiency in supplying pragmatic information; ii) mutual-information-driven message selection, which utilizes mutual information neural estimation to approach the optimal redundancy-less condition. Experiments on 3D object detection and BEV segmentation demonstrate that RDcomm achieves state-of-the-art accuracy on DAIR-V2X and OPV2V, while reducing communication volume by up to 108 times. The code will be released.
Abstract:Direct satellite-to-device communication is a promising future direction due to its lower latency and enhanced efficiency. However, intermittent and unpredictable terrestrial interference significantly affects system reliability and performance. Continuously employing sophisticated interference mitigation techniques is practically inefficient. Motivated by the periodic idle intervals characteristic of burst-mode satellite transmissions, this paper investigates online interference detection frameworks specifically tailored for satellite-to-device scenarios. We first rigorously formulate interference detection as a binary hypothesis testing problem, leveraging differences between Rayleigh (no interference) and Rice (interference present) distributions. Then, we propose a cumulative sum (CUSUM)-based online detector for scenarios with known interference directions, explicitly characterizing the trade-off between detection latency and false alarm rate, and establish its asymptotic optimality. For practical scenarios involving unknown interference direction, we further propose a generalized likelihood ratio (GLR)-based detection method, jointly estimating interference direction via the Root-MUSIC algorithm. Numerical results validate our theoretical findings and demonstrate that our proposed methods achieve high detection accuracy with remarkably low latency, highlighting their practical applicability in future satellite-to-device communication systems.
Abstract:This letter proposes a channel estimation method for reconfigurable intelligent surface (RIS)-assisted systems through a novel diffusion model (DM) framework. We reformulate the channel estimation problem as a denoising process, which aligns with the reverse process of the DM. To overcome the inherent randomness in the reverse process of conventional DM approaches, we adopt a deterministic sampling strategy with a step alignment mechanism that ensures the accuracy of channel estimation while adapting to different signal-to-noise ratio (SNR). Furthermore, to reduce the number of parameters of the U-Net, we meticulously design a lightweight network that achieves comparable performance, thereby enhancing the practicality of our proposed method. Extensive simulations demonstrate superior performance over a wide range of SNRs compared to baselines. For instance, the proposed method achieves performance improvements of up to 13.5 dB in normalized mean square error (NMSE) at SNR = 0 dB. Notably, the proposed lightweight network exhibits almost no performance loss compared to the original U-Net, while requiring only 6.59\% of its parameters.
Abstract:Fluid antenna (FA), as an emerging antenna technology, fully exploits spatial diversity. This paper integrates FA with the receive spatial modulation (RSM) scheme and proposes a novel FA-empowered RSM (FA-RSM) system. In this system, the transmitter is equipped with an FA that simultaneously activates multiple ports to transmit precoded signals. We address three key challenges in the FA-RSM system: port selection, theoretical analysis, and detection. First, for port selection, an optimal algorithm from a capacity maximization perspective are proposed, followed by two low-complexity alternatives. Second, for theoretical analysis, performance evaluation metrics are provided for port selection, which demonstrate that increasing the number of activated ports enhances system performance. Third, regarding detection, two low-complexity detectors are proposed. Simulation results confirm that the FA-RSM system significantly outperforms the conventional RSM system. The proposed low-complexity port selection algorithms facilitate minimal performance degradation. Moreover, while activating additional ports improves performance, the gain gradually saturates due to inherent spatial correlation, highlighting the importance of effective port selection in reducing system complexity and cost. Finally, both proposed detectors achieve near-optimal detection performance with low computational complexity, emphasizing the receiver-friendly nature of the FA-RSM system.
Abstract:Diffusion models (DMs) have recently achieved significant success in wireless communications systems due to their denoising capabilities. The broadcast nature of wireless signals makes them susceptible not only to Gaussian noise, but also to unaware interference. This raises the question of whether DMs can effectively mitigate interference in wireless semantic communication systems. In this paper, we model the interference cancellation problem as a maximum a posteriori (MAP) problem over the joint posterior probability of the signal and interference, and theoretically prove that the solution provides excellent estimates for the signal and interference. To solve this problem, we develop an interference cancellation diffusion model (ICDM), which decomposes the joint posterior into independent prior probabilities of the signal and interference, along with the channel transition probablity. The log-gradients of these distributions at each time step are learned separately by DMs and accurately estimated through deriving. ICDM further integrates these gradients with advanced numerical iteration method, achieving accurate and rapid interference cancellation. Extensive experiments demonstrate that ICDM significantly reduces the mean square error (MSE) and enhances perceptual quality compared to schemes without ICDM. For example, on the CelebA dataset under the Rayleigh fading channel with a signal-to-noise ratio (SNR) of $20$ dB and signal to interference plus noise ratio (SINR) of 0 dB, ICDM reduces the MSE by 4.54 dB and improves the learned perceptual image patch similarity (LPIPS) by 2.47 dB.
Abstract:Movable antenna (MA) has shown significant potential for improving the performance of integrated sensing and communication (ISAC) systems. In this paper, we model an MA-aided ISAC system operating in a communication full-duplex mono-static sensing framework. The self-interference channel is modeled as a function of the antenna position vectors under the near-field channel condition. We develop an optimization problem to maximize the weighted sum of downlink and uplink communication rates alongside the mutual information relevant to the sensing task. To address this highly non-convex problem, we employ the fractional programming (FP) method and propose an alternating optimization (AO)-based algorithm that jointly optimizes the beamforming, user power allocation, and antenna positions at the transceivers. Given the sensitivity of the AO-based algorithm to the initial antenna positions, a PSO-based algorithm is proposed to explore superior sub-optimal antenna positions within the feasible region. Numerical results indicate that the proposed algorithms enable the MA system to effectively leverage the antenna position flexibility for accurate beamforming in a complex ISAC scenario. This enhances the system's self-interference cancellation (SIC) capabilities and markedly improves its overall performance and reliability compared to conventional fixed-position antenna designs.
Abstract:Designing a 6G-oriented universal model capable of processing multi-modal data and executing diverse air interface tasks has emerged as a common goal in future wireless systems. Building on our prior work in communication multi-modal alignment and telecom large language model (LLM), we propose a scalable, task-aware artificial intelligence-air interface multi-modal universal model (AI2MMUM), which flexibility and effectively perform various physical layer tasks according to subtle task instructions. The LLM backbone provides robust contextual comprehension and generalization capabilities, while a fine-tuning approach is adopted to incorporate domain-specific knowledge. To enhance task adaptability, task instructions consist of fixed task keywords and learnable, implicit prefix prompts. Frozen radio modality encoders extract universal representations and adapter layers subsequently bridge radio and language modalities. Moreover, lightweight task-specific heads are designed to directly output task objectives. Comprehensive evaluations demonstrate that AI2MMUM achieves SOTA performance across five representative physical environment/wireless channel-based downstream tasks using the WAIR-D and DeepMIMO datasets.
Abstract:We present InstantSticker, a disentangled reconstruction pipeline based on Image-Based Lighting (IBL), which focuses on highly realistic decal blending, simulates stickers attached to the reconstructed surface, and allows for instant editing and real-time rendering. To achieve stereoscopic impression of the decal, we introduce shadow factor into IBL, which can be adaptively optimized during training. This allows the shadow brightness of surfaces to be accurately decomposed rather than baked into the diffuse color, ensuring that the edited texture exhibits authentic shading. To address the issues of warping and blurriness in previous methods, we apply As-Rigid-As-Possible (ARAP) parameterization to pre-unfold a specified area of the mesh and use the local UV mapping combined with a neural texture map to enhance the ability to express high-frequency details in that area. For instant editing, we utilize the Disney BRDF model, explicitly defining material colors with 3-channel diffuse albedo. This enables instant replacement of albedo RGB values during the editing process, avoiding the prolonged optimization required in previous approaches. In our experiment, we introduce the Ratio Variance Warping (RVW) metric to evaluate the local geometric warping of the decal area. Extensive experimental results demonstrate that our method surpasses previous decal blending methods in terms of editing quality, editing speed and rendering speed, achieving the state-of-the-art.
Abstract:This paper investigates joint device activity detection and channel estimation for grant-free random access in Low-earth orbit (LEO) satellite communications. We consider uplink communications from multiple single-antenna terrestrial users to a LEO satellite equipped with a uniform planar array of multiple antennas, where orthogonal frequency division multiplexing (OFDM) modulation is adopted. To combat the severe Doppler shift, a transmission scheme is proposed, where the discrete prolate spheroidal basis expansion model (DPS-BEM) is introduced to reduce the number of unknown channel parameters. Then the vector approximate message passing (VAMP) algorithm is employed to approximate the minimum mean square error estimation of the channel, and the Markov random field is combined to capture the channel sparsity. Meanwhile, the expectation-maximization (EM) approach is integrated to learn the hyperparameters in priors. Finally, active devices are detected by calculating energy of the estimated channel. Simulation results demonstrate that the proposed method outperforms conventional algorithms in terms of activity error rate and channel estimation precision.