Abstract:Reinforcement learning (RL) in non-stationary environments is challenging, as changing dynamics and rewards quickly make past experiences outdated. Traditional experience replay (ER) methods, especially those using TD-error prioritization, struggle to distinguish between changes caused by the agent's policy and those from the environment, resulting in inefficient learning under dynamic conditions. To address this challenge, we propose the Discrepancy of Environment Dynamics (DoE), a metric that isolates the effects of environment shifts on value functions. Building on this, we introduce Discrepancy of Environment Prioritized Experience Replay (DEER), an adaptive ER framework that prioritizes transitions based on both policy updates and environmental changes. DEER uses a binary classifier to detect environment changes and applies distinct prioritization strategies before and after each shift, enabling more sample-efficient learning. Experiments on four non-stationary benchmarks demonstrate that DEER further improves the performance of off-policy algorithms by 11.54 percent compared to the best-performing state-of-the-art ER methods.
Abstract:Multi-agent reinforcement learning (MARL) holds substantial promise for intelligent decision-making in complex environments. However, it suffers from a coordination and scalability bottleneck as the number of agents increases. To address these issues, we propose the LLM-empowered expert demonstrations framework for multi-agent reinforcement learning (LEED). LEED consists of two components: a demonstration generation (DG) module and a policy optimization (PO) module. Specifically, the DG module leverages large language models to generate instructions for interacting with the environment, thereby producing high-quality demonstrations. The PO module adopts a decentralized training paradigm, where each agent utilizes the generated demonstrations to construct an expert policy loss, which is then integrated with its own policy loss. This enables each agent to effectively personalize and optimize its local policy based on both expert knowledge and individual experience. Experimental results show that LEED achieves superior sample efficiency, time efficiency, and robust scalability compared to state-of-the-art baselines.
Abstract:Optical Wireless Communication (OWC) has gained significant attention due to its high-speed data transmission and throughput. Optical wireless channels are often assumed to be flat, but we evaluate frequency selective channels to consider high data rate optical wireless or very dispersive environments. To address this for optical scenarios, this paper presents a robust channel estimation framework with low-complexity to mitigate frequency-selective effects, then to improve system reliability and performance. This channel estimation framework contains a neural network that can estimate general optical wireless channels without prior channel information about the environment. Based on this estimate and the corresponding delay spread, one of several candidate offline-trained neural networks will be activated to predict this channel. Simulation results demonstrate that the proposed method has improved and robust normalized mean square error (NMSE) and bit error rate (BER) performance compared to conventional estimation methods while maintaining computational efficiency. These findings highlight the potential of neural network solutions in enhancing the performance of OWC systems under indoor channel conditions.
Abstract:In this paper, we propose an encoder-decoder neural architecture (called Channelformer) to achieve improved channel estimation for orthogonal frequency-division multiplexing (OFDM) waveforms in downlink scenarios. The self-attention mechanism is employed to achieve input precoding for the input features before processing them in the decoder. In particular, we implement multi-head attention in the encoder and a residual convolutional neural architecture as the decoder, respectively. We also employ a customized weight-level pruning to slim the trained neural network with a fine-tuning process, which reduces the computational complexity significantly to realize a low complexity and low latency solution. This enables reductions of up to 70\% in the parameters, while maintaining an almost identical performance compared with the complete Channelformer. We also propose an effective online training method based on the fifth generation (5G) new radio (NR) configuration for the modern communication systems, which only needs the available information at the receiver for online training. Using industrial standard channel models, the simulations of attention-based solutions show superior estimation performance compared with other candidate neural network methods for channel estimation.
Abstract:In this paper, we propose a method to design the training data that can support robust generalization of trained neural networks to unseen channels. The proposed design that improves the generalization is described and analysed. It avoids the requirement of online training for previously unseen channels, as this is a memory and processing intensive solution, especially for battery powered mobile terminals. To prove the validity of the proposed method, we use the channels modelled by different standards and fading modelling for simulation. We also use an attention-based structure and a convolutional neural network to evaluate the generalization results achieved. Simulation results show that the trained neural networks maintain almost identical performance on the unseen channels.
Abstract:In this paper, we deploy the self-attention mechanism to achieve improved channel estimation for orthogonal frequency-division multiplexing waveforms in the downlink. Specifically, we propose a new hybrid encoder-decoder structure (called HA02) for the first time which exploits the attention mechanism to focus on the most important input information. In particular, we implement a transformer encoder block as the encoder to achieve the sparsity in the input features and a residual neural network as the decoder respectively, inspired by the success of the attention mechanism. Using 3GPP channel models, our simulations show superior estimation performance compared with other candidate neural network methods for channel estimation.
Abstract:Research on machine learning for channel estimation, especially neural network solutions for wireless communications, is attracting significant current interest. This is because conventional methods cannot meet the present demands of the high speed communication. In the paper, we deploy a general residual convolutional neural network to achieve channel estimation for the orthogonal frequency-division multiplexing (OFDM) signals in a downlink scenario. Our method also deploys a simple interpolation layer to replace the transposed convolutional layer used in other networks to reduce the computation cost. The proposed method is more easily adapted to different pilot patterns and packet sizes. Compared with other deep learning methods for channel estimation, our results for 3GPP channel models suggest improved mean squared error performance for our approach.