Abstract:Over-the-air federated learning (FL) leverages the superposition property of multiple-access channels to enable communication-efficient distributed model training. Existing integrated sensing, communication, and computation (ISCC)-enabled over-the-air FL systems typically require dedicated resources for the sensing module, inevitably compromising FL performance due to resource competition. In this paper, we propose a sensing-native over-the-air FL framework that explores a built-in distributed wireless sensing capability with zero overhead per model aggregation. Specifically, the high-dimensional local gradient signals, which possess a favorable autocorrelation property, are concurrently leveraged for target distance estimation, while the gradient statistics already required for over-the-air FL serve as a ready-made gateway to deliver locally sensed results to the edge server for cooperative localization. To combat inter-device interference, channel fading, and communication noise, we put forth a robust trilateration-based target positioning method that builds upon an efficient matched-filtering-based distance estimation. Then, by explicitly characterizing the impact of imperfect model aggregation and noisy gradient-statistics transmission on the convergence of sensing-native over-the-air FL, we develop a statistics-aware communication-learning co-design approach. We first derive the closed-form optimal power budgets allocated to local gradients and their statistics, based on which an efficient successive convex approximation method is proposed for receiver beamforming optimization. Simulation results show that the proposed framework simultaneously achieves superior learning and sensing performance compared to representative baselines.
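As a rough illustration of the trilateration step mentioned in the abstract above, the sketch below recovers a 2D target position from per-device distance estimates by ordinary linearized least squares. It is not the robust positioning method the paper proposes, which is not specified here. The function name `trilaterate`, the 2D geometry, and the anchor layout are all assumptions made for illustration.

```python
from math import hypot


def trilaterate(anchors, distances):
    """Least-squares 2D position from >= 3 anchor positions and ranges.

    Hypothetical helper: plain linearized least squares, not the robust
    method described in the abstract.
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors with one range each")
    (x0, y0), d0 = anchors[0], distances[0]
    # Subtracting the first range equation from the others cancels the
    # quadratic terms and leaves a linear system A @ [x, y] = b.
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 + yi**2 - x0**2 - y0**2)
    # Solve the 2x2 normal equations (A^T A) p = A^T b by Cramer's rule.
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * r for (a, _), r in zip(rows, rhs))
    t2 = sum(b * r for (_, b), r in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Usage: four sensing devices at the corners of a square, noiseless ranges.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
target = (3.0, 4.0)
ranges = [hypot(target[0] - ax, target[1] - ay) for ax, ay in anchors]
estimate = trilaterate(anchors, ranges)
```

With noisy ranges the same solver returns the least-squares compromise across devices. A robust variant would additionally down-weight or reject outlying range estimates.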
Abstract:Radio maps provide the essential foundation for low-altitude networking systems. Unlike terrestrial radio maps that are typically generated via drive-test measurements, mapping the air-ground environment requires the deployment of unmanned aerial vehicles (UAVs). This shift introduces two formidable challenges in uncharted 3D scenarios. First, sparse radio measurements and incomplete geometric observations hinder accurate reconstruction. Second, the large 3D action space and strict power constraints from high spectrum scanner energy consumption make informative exploration difficult. To address these issues, this paper proposes 3D uncertainty-aware radio active mapping (3D-URAM), a closed-loop active perception framework that decouples the mapping process into two offline-trained stages. In Stage I, a Bayesian UNet is developed to recover radio maps from sparse measurements and partial geometry while providing calibrated predictive uncertainty. In Stage II, a dynamic probabilistic roadmap and a transformer-based waypoint selection policy trained via proximal policy optimization maximize long-horizon uncertainty reduction under travel budgets. Experimental results demonstrate that 3D-URAM reduces reconstruction error by over 50% compared to representative baselines. Real-world field tests within a 300 m × 200 m × 100 m space also validate the potential of active radio map reconstruction.
Abstract:Driven by the emerging low-altitude economy, uncrewed aerial vehicle (UAV) swarms offer flexible integrated air-ground access and backhaul. However, providing seamless connectivity is difficult due to the interdependent dynamics of user mobility and building blockages in these 3D scenarios. These factors create rapidly shifting bottlenecks in end-to-end paths. Furthermore, the multi-dimensional nature of joint control limits the effectiveness of traditional heuristics. To address these challenges, a Multi-Scale Radio Map-Guided (MRMG) framework is proposed. The MRMG framework handles heterogeneous dynamics by integrating three distinct levels of radio information: global-level maps provide regional coverage insights, local-level maps capture neighborhood-scale service conditions, and link-level maps characterize high-resolution channel features. This design effectively decouples macro-movement from micro-link adaptation. To yield long-term performance improvements, a multi-agent reinforcement learning (MARL) controller learns cooperative policies for UAV movement, next-hop selection, and transmit-power control. Simulation results show that the MRMG framework not only improves network throughput but also significantly bolsters cell-edge service, nearly doubling the 5th-percentile user rate.
Abstract:While radio-frequency (RF) field synthesis is fundamental to wireless networking, current approaches remain constrained by static assumptions, leaving them unable to track the rapid multipath reorganization of dynamic scenes. Modeling these transitions requires addressing two coupled challenges: explicit temporal representation and the capture of discrete path lifecycles. To bridge this gap, Temporal-Evolving Radio Field Synthesis (TeRFS) is introduced. TeRFS utilizes an anisotropic spherical Gaussian (ASG) directional basis to represent sparse, sharp angular structures, bound to analytical temporal envelopes that regulate path lifecycles. This formulation induces a mathematical birth-and-death mechanism, enabling individual multipath trajectories to emerge and vanish with temporal precision, a capability beyond the reach of standard smooth interpolation. Evaluations demonstrate that TeRFS outperforms state-of-the-art (SOTA) baselines, achieving an 11.5% reduction in mean squared error (MSE) alongside a 6.9× training speedup. Even in environments characterized by extreme structural mutation, TeRFS maintains robust tracking of dynamic reorganizations, limiting the median absolute error to 1.52 dB and establishing its utility for high-mobility wireless applications.
Abstract:Efficient beam alignment is fundamental to high-throughput and reliable connectivity in Vehicle-to-Everything (V2X) systems. However, conventional beam management in dynamic vehicular topologies incurs prohibitive alignment overhead and struggles to maintain robust links under rapid mobility. To overcome these challenges, this paper proposes a distributed multimodal graph beam alignment (GBA) framework. The core innovation lies in leveraging onboard multimodal sensing data to predict implicit feedback while employing graph neural networks to coordinate multi-user alignment, thereby jointly enhancing scalability and drastically reducing overhead. The architecture adopts a dual-network design with GBA-RSU and GBA-Vehicle units, optimized through a hybrid strategy of centralized learning and federated learning (FL) to balance global performance with local privacy. Furthermore, a dedicated data augmentation (DA) scheme is introduced to address multimodal data imbalance issues in vehicular networks. Negative augmentation applies dominant modality dropout to bolster robustness, while positive augmentation generates underrepresented samples to mitigate label imbalance. Numerical results demonstrate that GBA maintains a competitive sum rate on par with high-resolution codebook-based feedback yet reduces beam alignment overhead by over 90% and scales efficiently in mobile scenarios. Notably, integrating DA enables GBA to consistently outperform state-of-the-art FL-based alignment benchmarks, with particularly pronounced gains under severe label and modality imbalance, establishing a practical solution for V2X beam management.
Abstract:Current learning-based wireless methods struggle with generalization due to the fragmented processing of communication and sensing data. WiFo-MiSAC addresses this as a task-agnostic foundation model that tokenizes heterogeneous signals into a unified space for self-supervised pre-training. A shared-specific disentangled mixture-of-experts (SS-DMoE) architecture is employed to decouple modality-shared and modality-specific representations, facilitating interaction without cross-modal interference. By combining masked reconstruction with contrastive alignment, the model achieves state-of-the-art performance across downstream tasks, including beam prediction and channel estimation. Experimental results demonstrate robust few-shot adaptation and seamless integration of new modalities, positioning WiFo-MiSAC as a scalable backbone for future integrated sensing and communication systems.
Abstract:Fluid antenna systems (FAS) provide extra position-agile spatial diversity for integrated sensing and communication (ISAC) by jointly optimizing the port selection and precoding. However, this optimization is challenging in air-ground networks due to the intricate dual-objective Pareto frontier, complex self-interference, and prohibitive channel state information overhead. To overcome these bottlenecks, this work proposes a novel grey-box multi-objective Bayesian optimization framework to address the joint design of discrete port selection and ISAC precoding. Unlike black-box methods, this architecture explicitly leverages known physical system models to learn unknown channel constituents, dramatically reducing sample complexity. To navigate high-dimensional combinatorial spaces, an adaptive trust-region mechanism powered by expected hypervolume improvement (EHI) acquisition is implemented. Furthermore, the framework incorporates a spatio-temporal tracking strategy to handle the continuous mobility of users and targets, robustly capturing the drifting optimum in time-varying environments. Simulations demonstrate that this framework achieves significantly faster convergence and discovers superior Pareto-optimal configurations, validating its efficiency for dynamic real-time FAS-ISAC deployments.
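To make the hypervolume notion behind the EHI acquisition in the abstract above concrete, the sketch below computes the two-objective dominated hypervolume and the deterministic improvement a single candidate would bring. EHI is the expectation of this improvement under a surrogate model's predictive distribution, which is not modeled here. The function names, the maximization convention, and the example objective values are assumptions for illustration only.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (maximization)."""
    # Keep only points that strictly improve on the reference in both objectives.
    pts = [(f1, f2) for f1, f2 in points if f1 > ref[0] and f2 > ref[1]]
    # Sweep from the largest f1 down; each point adds a slab for the part of
    # f2 it raises above the best f2 seen so far.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 > best_f2:
            area += (f1 - ref[0]) * (f2 - best_f2)
            best_f2 = f2
    return area


def hypervolume_improvement(front, candidate, ref):
    """Gain in dominated area if `candidate` were added to `front`."""
    return hypervolume_2d(list(front) + [candidate], ref) - hypervolume_2d(front, ref)


# Usage: hypothetical (communication rate, sensing quality) pairs.
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
ref = (0.0, 0.0)
gain_new = hypervolume_improvement(front, (2.5, 2.5), ref)
gain_dominated = hypervolume_improvement(front, (1.0, 1.0), ref)
```

A candidate already dominated by the current front contributes zero improvement, which is why an acquisition of this kind steers sampling toward configurations that extend the Pareto frontier.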
Abstract:Accurate channel state information (CSI) is vital for multiple-input multiple-output (MIMO) systems. However, superimposed pilots (SIP), which reduce overhead, introduce severe pilot contamination and data interference, complicating joint channel estimation and data detection. This paper proposes a conditional flow matching receiver (CFM-Rx), an unsupervised generative framework that learns directly from received signals, eliminating the need for labeled data and improving adaptability across diverse system settings. By leveraging flow-based generative modeling, CFM-Rx enables deterministic, low-latency inference and exploits model invertibility to capture the bidirectional nature of signal propagation. This framework unifies flow matching with score-based diffusion modeling via a moment-consistent ordinary differential equation (ODE), replacing stochastic differential equation (SDE) sampling with a deterministic and efficient process. Furthermore, it integrates receiver-side priors to ensure stable, data-consistent inference. Extensive simulation results across various MIMO configurations demonstrate that CFM-Rx consistently outperforms conventional estimators and state-of-the-art data-driven receivers, achieving notable gains in channel estimation accuracy and symbol detection robustness, particularly under severe pilot contamination.
Abstract:The expansion of the low-altitude economy is contingent on reliable cellular connectivity for unmanned aerial vehicles (UAVs). A key challenge in pre-flight planning is predicting communication link quality along proposed and pre-defined routes, a task hampered by sparse measurements that render existing radio map methods ineffective. This paper introduces a transfer learning framework for high-fidelity route-level radio map prediction. Our key insight is to leverage abundant crowdsourced ground signals as auxiliary supervision. To bridge the significant domain gap between ground and aerial data and address spatial sparsity, our framework learns general propagation priors from simulation, performs adversarial alignment of the feature spaces, and is fine-tuned on limited real UAV measurements. Extensive experiments on a real-world dataset from Meituan show that our method achieves over 50% higher accuracy in predicting Route RSRP compared to state-of-the-art baselines.
Abstract:Accurate precoding in massive multiple-input multiple-output (MIMO) frequency-division duplexing (FDD) systems relies on efficient channel state information (CSI) acquisition. End-to-end learning frameworks improve performance by jointly optimizing this process, but they lack scalability and fail to generalize across different system configurations, such as varying numbers of antennas and users. To overcome this limitation, we introduce WiFo-E, a wireless foundation model designed for scalable end-to-end precoding. WiFo-E employs multi-task pretraining on a diverse set of configurations to learn transferable representations of underlying wireless principles. Central to the model is a sparse Mixture-of-Experts (MoE) Transformer architecture, which mitigates task interference and enhances training efficiency by activating specialized parameter subsets adaptively. Extensive simulations demonstrate that WiFo-E outperforms conventional per-configuration training and shows strong generalization to unseen system configurations, providing a flexible and efficient foundation for adaptive massive MIMO precoding.
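The sparse expert activation described in the abstract above can be sketched with top-k gating, one common way to route an input to a small subset of experts. The abstract does not state which routing rule WiFo-E uses, so top-k softmax gating, the function names, and the toy scalar experts below are all assumptions.

```python
from math import exp


def top_k_route(gate_logits, k):
    """Return {expert_index: weight} for the k largest gate logits."""
    if not 1 <= k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    # Softmax restricted to the selected experts (shifted for numerical stability).
    peak = max(gate_logits[i] for i in top)
    scores = {i: exp(gate_logits[i] - peak) for i in top}
    total = sum(scores.values())
    return {i: s / total for i, s in scores.items()}


def sparse_moe(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of only the k selected experts."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Usage: four toy experts that scale their input. In a real layer the gate
# logits would come from a learned router applied to x, not a fixed list.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.0, 2.0, 2.0, -1.0]
routing = top_k_route(gate_logits, 2)
output = sparse_moe(1.0, experts, gate_logits, k=2)
```

Only the selected experts are evaluated, so the per-input compute grows with k rather than with the total number of experts.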