Abstract:Ray tracing (RT) has emerged as a key tool for propagation channel modeling and network planning. Conventional RT is based on electromagnetic (EM) wave theory and its application relies on detailed mesh-based environment representations and material properties. In realistic environments, limited environmental geometry and material uncertainties hinder its scalability to complex scenarios. In this paper, we propose a novel physics-aware neural RT surrogate named PointNeRT to address these limitations. The proposed model directly takes point clouds as environmental input and efficiently reconstructs multipath without explicitly constructing mesh models or manually defining EM interaction rules. PointNeRT adopts a hop-by-hop modeling strategy guided by physical interaction constraints. It supports sequential prediction of multipath propagation and power attenuation. Numerical results and experiments demonstrate that the proposed method implicitly captures surface normal characteristics and EM material effects. It further achieves robust generalization in mobility scenarios and provides a physics-guided neural modeling of multipath propagation.
Abstract:In urban environments, vehicle-to-everything (V2X) communications require accurate wireless channel characterization. This requirement is particularly critical at street-canyon intersections, where building blockage and rich multipath propagation can severely degrade link reliability. Due to the unique layout of urban canyons, the channel characteristics there are influenced by the building distribution. However, this feature has not been well captured in existing channel models. In this paper, we propose an environment-related statistical channel model based on 5.8 GHz channel measurements. We construct a composite environmental factor to characterize environmental differences between intersections. Then, the factor is incorporated into the 3GPP path-loss model and further linked to small-scale channel parameters. Finally, the accuracy of the proposed model is validated using second-order channel statistics. The results show that the proposed model can effectively characterize the propagation properties of urban street-canyon intersection channels with different building conditions. The proposed model provides a physically interpretable and statistically effective framework for channel simulation and performance evaluation in urban vehicular scenarios.
Abstract:The deep integration of communication with intelligence and sensing, as a defining vision of 6G, renders environment-aware channel prediction a key enabling technology. As a representative 6G application, vehicular communications require accurate and forward-looking channel prediction under stringent reliability, latency, and adaptability demands. Traditional empirical and deterministic models remain limited in balancing accuracy, generalization, and deployability, while the growing availability of onboard and roadside sensing devices offers a promising source of environmental priors. This paper proposes an environment-aware channel prediction framework based on multimodal visual feature fusion. Using GPS data and vehicle-side panoramic RGB images, together with semantic segmentation and depth estimation, the framework extracts semantic, depth, and position features through a three-branch architecture and performs adaptive multimodal fusion via a squeeze-excitation attention gating module. For 360-dimensional angular power spectrum (APS) prediction, a dedicated regression head and a composite multi-constraint loss are further designed. As a result, joint prediction of path loss (PL), delay spread (DS), azimuth spread of arrival (ASA), azimuth spread of departure (ASD), and APS is achieved. Experiments on a synchronized urban V2I measurement dataset yield the best root mean square error (RMSE) of 3.26 dB for PL, RMSEs of 37.66 ns, 5.05 degrees, and 5.08 degrees for DS, ASA, and ASD, respectively, and mean/median APS cosine similarities of 0.9342/0.9571, demonstrating strong accuracy, generalization, and practical potential for intelligent channel prediction in 6G vehicular communications.
Abstract:Site-specific channel inference plays a critical role in the design and evaluation of next-generation wireless communication systems by considering the surrounding propagation environment. However, traditional methods do not scale, while existing AI-based approaches using satellite images are confined to predicting large-scale fading parameters and lack the capacity to reconstruct the complete channel impulse response (CIR). To address this limitation, we propose a deep learning-based site-specific channel inference framework that uses satellite images to predict structured Tapped Delay Line (TDL) parameters. We first establish a joint channel-satellite dataset based on measurements. Then, a novel deep learning network is developed to reconstruct the channel parameters. Specifically, a cross-attention-fused dual-branch pipeline extracts macroscopic and microscopic environmental features, while a recurrent tracking module captures the long-term dynamic evolution of multipath components. Experimental results demonstrate that the proposed method achieves high-quality reconstruction of the CIR in unseen scenarios, with a Power Delay Profile (PDP) Average Cosine Similarity exceeding 0.96. This work provides a pathway toward site-specific channel inference for future dynamic wireless networks.
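As an illustration of the reported metric, the following is a minimal sketch of an average cosine similarity between measured and predicted PDPs. The abstract does not state whether the PDPs are compared in linear or dB scale, nor how the average is taken; the function names and the simple per-snapshot mean used here are assumptions for illustration only.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length power delay profiles."""
    if len(p) != len(q):
        raise ValueError("PDPs must have the same number of delay taps")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero PDP")
    return dot / (norm_p * norm_q)


def average_cosine_similarity(measured, predicted):
    """Mean per-snapshot cosine similarity (assumed averaging scheme)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need the same non-zero number of snapshots")
    total = sum(cosine_similarity(m, p) for m, p in zip(measured, predicted))
    return total / len(measured)
```

Because cosine similarity is invariant to a common scale factor, this metric rewards a correct PDP shape rather than a correct absolute power level.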
Abstract:Accurate path loss prediction is crucial for wireless network planning and optimization in suburban environments with complex terrain variation and diverse land cover. This paper proposes a model-assisted hybrid path loss prediction method that introduces an environment-adaptive compensation on top of the classic close-in free-space reference distance (CI) path loss model. By jointly predicting the path loss exponent and a compensation term, the proposed approach dynamically adjusts the empirical trend. To improve the effectiveness of the environmental representation, three environmental image organization schemes are constructed and evaluated. Experiments on measurement data collected on Pingtan Island show that the proposed method outperforms the CI model and a conventional model-assisted baseline, achieving a test root mean square error of 4.04 dB.
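For context, the standard CI model with a 1 m reference distance can be sketched as below, with an additive compensation term appended. How the paper predicts the exponent and the compensation term is not described in the abstract, so both are plain inputs here; the function names and the additive form of the compensation are assumptions, not the authors' implementation.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_1m_db(freq_hz):
    """Free-space path loss in dB at the 1 m reference distance."""
    return 20.0 * math.log10(4.0 * math.pi * freq_hz / SPEED_OF_LIGHT)


def ci_path_loss_db(dist_m, freq_hz, ple, compensation_db=0.0):
    """CI path loss plus an assumed additive compensation term.

    ple is the path loss exponent; compensation_db stands in for the
    environment-adaptive correction (hypothetical input in this sketch).
    """
    if dist_m <= 0.0:
        raise ValueError("distance must be positive")
    return fspl_1m_db(freq_hz) + 10.0 * ple * math.log10(dist_m) + compensation_db
```

With `ple=2.0` and `compensation_db=0.0` the expression reduces to free-space path loss, which is a convenient sanity check for any learned predictor of the two quantities.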
Abstract:This paper proposes a novel paradigm centered on Artificial Intelligence (AI)-empowered propagation channel prediction to address the limitations of traditional channel modeling. We present a comprehensive framework that deeply integrates heterogeneous environmental data and physical propagation knowledge into AI models for site-specific channel prediction, which is referred to as channel inference. By leveraging AI to infer site-specific wireless channel states, the proposed paradigm enables accurate prediction of channel characteristics at both link and area levels, capturing the spatio-temporal evolution of radio propagation. Several novel strategies to realize the paradigm are introduced and discussed, including AI-native and AI-hybrid inference approaches. This paper also investigates how to enhance model generalization through transfer learning and improve interpretability via explainable AI techniques. Our approach demonstrates significant practical efficacy, achieving an average path loss prediction root mean square error (RMSE) of approximately 4 dB and reducing training time by 60%-75%. This new modeling paradigm provides a foundational pathway toward high-fidelity, generalizable, and physically consistent propagation channel prediction for future communication networks.




Abstract:Recent advances in multimodal large language models unlock unprecedented opportunities for GUI automation. However, a fundamental challenge remains: how to efficiently acquire high-quality training data while maintaining annotation reliability? We introduce a self-evolving training pipeline powered by the Calibrated Step Reward System, which converts model-generated trajectories into reliable training signals through trajectory-level calibration, achieving >90% annotation accuracy with 10-100x lower cost. Leveraging this pipeline, we introduce Step-GUI, a family of models (4B/8B) that achieves state-of-the-art GUI performance (8B: 80.2% AndroidWorld, 48.5% OSWorld, 62.6% ScreenShot-Pro) while maintaining robust general capabilities. As GUI agent capabilities improve, practical deployment demands standardized interfaces across heterogeneous devices while protecting user privacy. To this end, we propose GUI-MCP, the first Model Context Protocol for GUI automation, with a hierarchical architecture that combines low-level atomic operations and high-level task delegation to local specialist models, enabling high-privacy execution where sensitive data stays on-device. Finally, to assess whether agents can handle authentic everyday usage, we introduce AndroidDaily, a benchmark grounded in real-world mobile usage patterns with 3146 static actions and 235 end-to-end tasks across high-frequency daily scenarios (8B: static 89.91%, end-to-end 52.50%). Our work advances the development of practical GUI agents and demonstrates strong potential for real-world deployment in everyday digital interactions.
Abstract:With the rapid deployment of 5G and 6G networks, accurate modeling of urban radio propagation has become critical for system design and network planning. However, conventional statistical or empirical models fail to fully capture the influence of detailed geometric features on site-specific channel variations in dense urban environments. In this paper, we propose a geometry map-based propagation channel model that directly extracts key parameters from a 3D geometry map and incorporates the Uniform Theory of Diffraction (UTD) to recursively compute multiple diffraction fields, thereby enabling accurate prediction of site-specific large-scale path loss and time-varying Doppler characteristics in urban scenarios. An identification algorithm is developed to efficiently detect buildings that significantly affect signal propagation. The proposed model is validated using urban measurement data, showing excellent agreement of path loss in both line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. In particular, for NLOS scenarios with complex diffractions, it outperforms the 3GPP and simplified models, reducing the RMSE by 7.1 dB and 3.18 dB, respectively. Doppler analysis further demonstrates its accuracy in capturing time-varying propagation characteristics, confirming the scalability and generalization of the model in urban environments.




Abstract:Integrated Sensing and Communication (ISAC) technology plays a critical role in future intelligent transportation systems by enabling vehicles to perceive and reconstruct the surrounding environment through reuse of wireless signals, thereby reducing or even eliminating the need for additional sensors such as LiDAR or radar. However, existing ISAC-based reconstruction methods often lack the ability to track dynamic scenes with sufficient accuracy and temporal consistency, limiting their real-world applicability. To address this limitation, we propose a deep learning-based framework for vehicular environment reconstruction using ISAC channels. We first establish a joint channel-environment dataset based on multimodal measurements from real-world urban street scenarios. Then, a multistage deep learning network is developed to reconstruct the environment. Specifically, a scene decoder identifies the environmental context, such as buildings and trees; a cluster center decoder predicts coarse spatial layouts by localizing dominant scattering centers; and a point cloud decoder recovers the fine-grained geometry and structure of the surrounding environment. Experimental results demonstrate that the proposed method achieves high-quality dynamic environment reconstruction with a Chamfer Distance of 0.29 and an F-Score@1% of 0.87. In addition, complexity analysis demonstrates the efficiency and practical applicability of the method in real-time scenarios. This work provides a pathway toward low-cost environment reconstruction based on ISAC for future intelligent transportation.
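To make the Chamfer Distance metric concrete, here is a minimal brute-force sketch. Several conventions exist (squared versus unsquared distances, sum versus mean of the two directions), and the abstract does not say which one is used; this version takes the sum of the two mean nearest-neighbor Euclidean distances, which is an assumption, as is the function name.

```python
import math


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two point clouds.

    Each cloud is a non-empty sequence of equal-dimension coordinate tuples.
    Uses unsquared Euclidean distances and sums both directions (assumed
    convention). Complexity is O(len(cloud_a) * len(cloud_b)).
    """
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")

    def one_way(src, dst):
        # Mean distance from each source point to its nearest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)
```

For large point clouds a spatial index such as a k-d tree would replace the inner `min`, but the brute-force form keeps the definition easy to read.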
Abstract:With the advancement of sixth-generation (6G) wireless communication systems, integrated sensing and communication (ISAC) is crucial for perceiving and interacting with the environment via electromagnetic propagation, termed channel semantics, to support tasks like decision-making. However, channel models focusing on physical characteristics face challenges in representing semantics embedded in the channel, thereby limiting the evaluation of ISAC systems. To tackle this, we present a novel framework for channel modeling from the conceptual event perspective. By leveraging a multi-level semantic structure and characterized knowledge libraries, the framework decomposes complex channel characteristics into extensible semantic characterization, thereby better capturing the relationship between environment and channel, and enabling more flexible adjustments of channel models for different events without requiring a complete reset. Specifically, we define channel semantics on three levels: status semantics, behavior semantics, and event semantics, corresponding to channel multipaths, channel time-varying trajectories, and channel topology, respectively. Taking realistic vehicular ISAC scenarios as an example, we perform semantic clustering, characterizing status semantics via multipath statistical distributions, modeling behavior semantics using Markov chains for time variation, and representing event semantics through a co-occurrence matrix. Results show the model accurately generates channels while capturing rich semantic information. Moreover, its generalization supports customized semantics.
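The use of a Markov chain for time variation can be illustrated with a minimal sketch that samples a sequence of discrete states from a row-stochastic transition matrix. What the states represent in the paper, and how its transition probabilities are estimated, are not given in the abstract; the function name, the integer state indices, and the example matrix are hypothetical.

```python
import random


def simulate_markov_chain(transition, start, steps, seed=0):
    """Sample a state trajectory from a first-order Markov chain.

    transition[i][j] is the probability of moving from state i to state j
    (each row is assumed to sum to 1). Returns steps + 1 state indices,
    beginning with start.
    """
    rng = random.Random(seed)  # seeded for reproducible trajectories
    states = [start]
    for _ in range(steps):
        row = transition[states[-1]]
        next_state = rng.choices(range(len(row)), weights=row, k=1)[0]
        states.append(next_state)
    return states


# Hypothetical two-state example: state 0 persists with probability 0.9.
example_path = simulate_markov_chain([[0.9, 0.1], [0.2, 0.8]], start=0, steps=20)
```

In a fitted model, each row of the matrix would typically be estimated from the relative frequencies of observed state transitions in the measured trajectories.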