Terahertz (THz) communications are envisioned as a key technology for sixth generation (6G) wireless systems. The study of the underlying THz wireless propagation channels provides the foundation for the development of reliable THz communication systems and their applications. This article provides a comprehensive overview of the study of THz wireless channels. First, the three most popular THz channel measurement methodologies, namely, frequency-domain channel measurement based on a vector network analyzer (VNA), time-domain channel measurement based on sliding correlation, and time-domain channel measurement based on THz pulses from time-domain spectroscopy (THz-TDS), are introduced and compared. Current channel measurement systems and measurement campaigns are reviewed. Then, existing channel modeling methodologies are categorized into deterministic, stochastic, and hybrid approaches. State-of-the-art THz channel models are analyzed, and the channel simulators that are based on them are introduced. Next, an in-depth review of channel characteristics in the THz band is presented. Finally, open problems and future research directions for THz wireless channels in 6G are elaborated.
As multipath components (MPCs) are experimentally observed to appear in clusters, cluster-based channel models have been a focus of wireless channel studies. However, most MPC clustering algorithms for MIMO channels that use the delay and angle information of MPCs rely on a distance metric, which quantifies the similarity of two MPCs and determines the preferred cluster shape, and therefore greatly impacts MPC clustering quality. In this paper, a general framework of Mahalanobis-distance metrics is proposed for MPC clustering in MIMO channel analysis, without user-specified parameters. Remarkably, the popular multipath component distance (MCD) is proved to be a special case of the proposed distance metric framework. Furthermore, two machine learning algorithms, namely, weakly supervised Mahalanobis metric for clustering and supervised large margin nearest neighbor, are introduced to learn the distance metric. To evaluate the effectiveness, a modified channel model is proposed based on the 3GPP spatial channel model (SCM) to generate clustered MPCs with delay and angular information, since the original SCM cannot be used to evaluate clustering quality. Experimental results show that the proposed distance metric can significantly improve the clustering quality of existing clustering algorithms, while the learning phase requires only limited effort in labeling MPCs.
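To make the idea of a Mahalanobis-type distance between MPCs concrete, the following is a minimal sketch in Python. The feature layout (normalized delay plus the cosine and sine of one angle), the example values, and the matrix `weighted` are hypothetical choices for illustration only. They are not the parameterization used in the paper, and the diagonal matrix shown is not claimed to reproduce the MCD.

```python
import math

def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric positive
    semidefinite matrix M given as a list of rows."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    quad = sum(d_i * v_i for d_i, v_i in zip(diff, m_diff))
    # Guard against tiny negative values caused by rounding
    return math.sqrt(max(quad, 0.0))

# Hypothetical MPC features: [normalized delay, cos(azimuth), sin(azimuth)]
mpc_a = [0.10, 1.0, 0.0]
mpc_b = [0.40, 0.0, 1.0]

# With the identity matrix the metric reduces to the Euclidean distance.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A hypothetical diagonal matrix that emphasizes delay over angle.
weighted = [[4.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.25]]

d_euclid = mahalanobis_distance(mpc_a, mpc_b, identity)
d_weighted = mahalanobis_distance(mpc_a, mpc_b, weighted)
print(d_euclid, d_weighted)
```

Changing `M` changes which MPC pairs count as similar, and hence the cluster shapes a clustering algorithm prefers. This is why learning the matrix from labeled MPCs, rather than fixing it by hand, can affect clustering quality.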
Packet routing is a fundamental problem in communication networks that decides how packets are directed from their source nodes to their destination nodes through intermediate nodes. With the increasing complexity of network topology and highly dynamic traffic demand, conventional model-based and rule-based routing schemes show significant limitations, due to their simplified and unrealistic model assumptions and their lack of flexibility and adaptation. Adding intelligence to network control is becoming a trend and the key to achieving high-efficiency network operation. In this paper, we develop a model-free and data-driven routing strategy by leveraging reinforcement learning (RL), where routers interact with the network and learn from experience to make good routing configurations for the future. Considering the graph nature of the network topology, we design a multi-agent RL framework in combination with a Graph Neural Network (GNN), tailored to the routing problem. Three deployment paradigms, namely centralized, federated, and cooperated learning, are explored. Simulation results demonstrate that our algorithm outperforms existing benchmark algorithms in terms of packet transmission delay and affordable load.
Maintaining long-term exploration ability remains one of the challenges of deep reinforcement learning (DRL). In practice, intrinsic reward shaping (IRS) approaches are leveraged to provide intrinsic rewards that incentivize the agent to explore. However, most existing IRS modules rely on attendant models or additional memory to record and analyze learning procedures, which leads to high computational complexity and low robustness. Moreover, they overemphasize the influence of a single state on exploration and thus cannot evaluate exploration performance from a global perspective. To tackle this problem, state entropy-based methods have been proposed to encourage the agent to visit the state space more equitably. However, their estimation error and sample complexity are prohibitive when handling environments with high-dimensional observations. In this paper, we introduce Jain's fairness index (JFI) as a novel metric to replace the entropy regularizer, which requires no additional models or memory. In particular, JFI overcomes the vanishing intrinsic rewards problem and can be generalized to arbitrary tasks. Furthermore, we use a variational auto-encoder (VAE) model to capture the life-long novelty of states. The global JFI score and the local state novelty are then combined to form a multimodal intrinsic reward, which controls the extent of exploration more precisely. Finally, extensive simulation results demonstrate that our multimodal reward shaping (MMRS) method achieves higher performance than other benchmark schemes.
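For reference, Jain's fairness index of n non-negative values x_1, ..., x_n is (sum of x_i)^2 / (n * sum of x_i^2). It equals 1 when all values are equal and 1/n when a single value carries all the weight. The sketch below applies it to visitation counts over a small discrete state space. This is an assumption made for illustration and is not necessarily how the paper computes its global JFI score.

```python
def jain_fairness_index(counts):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 0.0 for an empty or all-zero input, where the index
    is otherwise undefined (a convention chosen for this sketch).
    """
    n = len(counts)
    total = sum(counts)
    sum_sq = sum(c * c for c in counts)
    if n == 0 or sum_sq == 0:
        return 0.0
    return (total * total) / (n * sum_sq)

# Hypothetical visitation counts over four discrete states.
uniform_visits = [5, 5, 5, 5]   # every state visited equally often
skewed_visits = [20, 0, 0, 0]   # the agent stays in a single state

jfi_uniform = jain_fairness_index(uniform_visits)
jfi_skewed = jain_fairness_index(skewed_visits)
print(jfi_uniform, jfi_skewed)
```

The index needs only running sums of counts and squared counts, so it requires no auxiliary model, which is consistent with the motivation stated above.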
Fine-grained location prediction on smartphones can be used to improve app/system performance. Application scenarios include video quality adaptation as a function of the 5G network quality at predicted user locations, and augmented reality apps that speed up content rendering based on predicted user locations. Such use cases require a prediction error in the same range as the GPS error, and no existing work on location prediction achieves this level of accuracy. We present a system for fine-grained location prediction (FGLP) of mobile users, based on GPS traces collected on the phones. FGLP has two components: a federated learning framework and a prediction model. The framework runs on the phones of the users and also on a server that coordinates learning from all users in the system. FGLP represents the user location data as relative points in an abstract 2D space, which enables learning across different physical spaces. The model merges Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Networks (CNN), where BiLSTM learns the speed and direction of the mobile users, and CNN learns information such as user movement preferences. FGLP uses federated learning to protect user privacy and reduce bandwidth consumption. Our experimental results, using a dataset with over 600,000 users, demonstrate that FGLP outperforms baseline models in terms of prediction accuracy. We also demonstrate that FGLP works well in conjunction with transfer learning, which enables model reusability. Finally, benchmark results on several types of Android phones demonstrate FGLP's feasibility in real life.
Since the mapping relationship between intra-interventional 2D X-ray and pre-interventional 3D Computed Tomography (CT) images is uncertain, auxiliary positioning devices or body markers, such as medical implants, are commonly used to determine this relationship. However, such approaches cannot be widely used in clinical practice because of the complexity of real-world conditions. To determine the mapping relationship and achieve an initial pose estimation of the human body without auxiliary equipment or markers, a cross-modal matching transformer network is proposed to match 2D X-ray and 3D CT images directly. The proposed approach first uses deep learning to extract skeletal features from 2D X-ray and 3D CT images. The features are then converted into 1D X-ray and CT representation vectors, which are combined using a multi-modal transformer. As a result, the well-trained network can directly predict the spatial correspondence between arbitrary 2D X-ray and 3D CT images. The experimental results show that, when our approach is combined with the conventional approach, the achieved accuracy and speed can meet basic clinical intervention needs, and that it provides a new direction for intra-interventional registration.
In this paper, we investigate the problem of inverse reinforcement learning (IRL), especially beyond-demonstrator (BD) IRL. BD-IRL aims not only to imitate the expert policy but also to extrapolate a BD policy from a finite set of expert demonstrations. Currently, most BD-IRL algorithms are two-stage: they first infer a reward function and then learn the policy via reinforcement learning (RL). Because of the two separate procedures, the two-stage algorithms have high computational complexity and lack robustness. To overcome these flaws, we propose a BD-IRL framework entitled hybrid adversarial inverse reinforcement learning (HAIRL), which integrates imitation and exploration into one procedure. The simulation results show that HAIRL is more efficient and robust than other similar state-of-the-art (SOTA) algorithms.
User scheduling is a classical problem and key technology in wireless communication, which will still play an important role in the prospective 6G. Many sophisticated schedulers are widely deployed in base stations, such as Proportional Fairness (PF) and Round-Robin Fashion (RRF). It is known that Opportunistic (OP) scheduling is the optimal scheduler for maximizing the average user data rate (AUDR) under full buffer traffic. However, the optimal strategy for achieving the highest fairness remains largely unknown under both full buffer traffic and bursty traffic. In this work, we investigate the problem of fairness-oriented user scheduling, especially for resource block group (RBG) allocation. We build a user scheduler using Multi-Agent Reinforcement Learning (MARL), which conducts distributional optimization to maximize the fairness of the communication system. The agents take cross-layer information (e.g., RSRP, buffer size) as the state and the RBG allocation result as the action, and then explore the optimal solution following a well-defined reward function designed for maximizing fairness. Furthermore, we take the 5%-tile user data rate (5TUDR) as the key performance indicator (KPI) of fairness, and compare the performance of MARL scheduling with PF scheduling and RRF scheduling through extensive simulations. The simulation results show that the proposed MARL scheduling outperforms the traditional schedulers.
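As an illustration of why the 5%-tile user data rate captures fairness while the average does not, the sketch below computes both for two sets of per-user rates. The rate values are made up for this example, and the nearest-rank percentile definition is an assumption, since the abstract does not state which percentile estimator is used.

```python
def percentile_nearest_rank(values, p):
    """Nearest-rank percentile for an integer p in 1..100:
    the value at rank ceil(p * n / 100) in the sorted list."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    n = len(ordered)
    rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
    return ordered[rank - 1]

def average(values):
    return sum(values) / len(values)

# Hypothetical per-user data rates (Mbit/s) under two schedulers.
rates_fair = [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
rates_unfair = [0.5, 1.0, 2.0, 3.0, 10.0, 12.0, 14.0, 15.0, 16.0, 18.0]

tudr_fair = percentile_nearest_rank(rates_fair, 5)
tudr_unfair = percentile_nearest_rank(rates_unfair, 5)
audr_fair = average(rates_fair)
audr_unfair = average(rates_unfair)
print(tudr_fair, tudr_unfair, audr_fair, audr_unfair)
```

In this made-up data the second scheduler has the higher average rate but a much lower 5%-tile rate, which is the kind of trade-off a fairness-oriented KPI is meant to expose.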