In this paper, we have extended semantic communication, so far limited to processing a single task, to a more general system that can handle multiple tasks concurrently. To this end, we first introduced our definition of the "semantic source", which enables the interpretation of multiple semantics based on a single observation. A semantic encoder design is then introduced, in which the encoder is divided into a common unit and multiple specific units, enabling cooperative multi-task processing. Simulation results demonstrate the effectiveness of the proposed semantic source and the system design. Our approach employs information maximization (infomax) and end-to-end design principles.
Low Earth Orbit (LEO) satellite-to-handheld connections herald a new era in satellite communications. Space-Division Multiple Access (SDMA) precoding is a method that mitigates interference among satellite beams, boosting spectral efficiency. While optimal SDMA precoding solutions have been proposed for ideal channel knowledge in various scenarios, addressing robust precoding with imperfect channel information has primarily been limited to simplified models. However, these models might not capture the complexity of LEO satellite applications. We use the Soft Actor-Critic (SAC) deep Reinforcement Learning (RL) method to learn robust precoding strategies without the need for explicit insights into the system conditions and imperfections. Our results show flexibility to adapt to arbitrary system configurations while performing strongly in terms of achievable rate and robustness to disruptive influences compared to analytical benchmark precoders.
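The achievable rate used as a performance measure above can be illustrated with a minimal sketch that evaluates the sum rate of a given linear precoder for single-antenna users. The function name `sum_rate`, the two-user example channel, and the unit noise power are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def sum_rate(channel, precoder, noise_power=1.0):
    """Sum rate in bit/s/Hz for single-antenna users under a linear precoder.

    channel[k][n]:  complex gain from transmit antenna n to user k.
    precoder[n][k]: complex weight of antenna n for user k's symbol.
    """
    num_users = len(channel)
    num_antennas = len(channel[0])
    total = 0.0
    for k in range(num_users):
        # Power received at user k from every user's precoded stream.
        powers = []
        for j in range(num_users):
            amplitude = sum(channel[k][n] * precoder[n][j] for n in range(num_antennas))
            powers.append(abs(amplitude) ** 2)
        signal = powers[k]
        interference = sum(powers) - signal
        # Rate of user k from its signal-to-interference-plus-noise ratio.
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total

# Hypothetical example: orthogonal channel with an identity precoder,
# so neither user sees interference from the other.
H = [[1 + 0j, 0j], [0j, 1 + 0j]]
W = [[1 + 0j, 0j], [0j, 1 + 0j]]
rate = sum_rate(H, W)
```

A learned precoder would be compared against analytical benchmarks by evaluating such a metric over many channel realizations, including erroneous channel estimates.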
Mega-constellations of small satellites have evolved into a source of massive amounts of valuable data. To manage this data efficiently, on-board federated learning (FL) enables satellites to train a machine learning (ML) model collaboratively without having to share the raw data. This paper introduces a scheme for scheduling on-board FL for constellations connected with intra-orbit inter-satellite links. The proposed scheme utilizes the predictable visibility pattern between satellites and the ground station (GS), both at the individual satellite level and cumulatively within the entire orbit, to mitigate intermittent connectivity and make the best use of the available time. To this end, two distinct schedulers are employed: one for coordinating the FL procedures among orbits, and the other for controlling those within each orbit. These two schedulers cooperatively determine the appropriate time to perform global updates at the GS and then allocate a suitable duration to the satellites within each orbit for local training, proportional to the usable time until the next global update. This scheme leads to improved test accuracy within a shorter time.
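The proportional allocation of local training time described above could look roughly like the following sketch. The abstract does not give the exact rule, so the function name `allocate_training_time`, the satellite identifiers, and the `fraction` parameter (the share of usable time spent on training rather than model exchange) are all hypothetical.

```python
def allocate_training_time(usable_time_s, fraction=0.8):
    """Give each satellite a local-training duration proportional to its usable time.

    usable_time_s: dict mapping a satellite id to the seconds it can use
                   before the next global update (assumed to be known from
                   the predictable visibility pattern).
    fraction:      hypothetical share of that time spent on training.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    # Negative inputs are treated as "no usable time".
    return {sat: fraction * max(t, 0.0) for sat, t in usable_time_s.items()}

# Hypothetical orbit with three satellites.
plan = allocate_training_time({"sat_a": 100.0, "sat_b": 50.0, "sat_c": 0.0}, fraction=0.5)
```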
Motivated by the recent success of Machine Learning tools in wireless communications, the idea of semantic communication by Weaver from 1949 has gained attention. It breaks with Shannon's classic design paradigm by aiming to transmit the meaning, i.e., semantics, of a message instead of its exact version, allowing for information rate savings. In this work, we apply the Stochastic Policy Gradient (SPG) to design a semantic communication system by reinforcement learning, not requiring a known or differentiable channel model - a crucial step towards deployment in practice. Further, we motivate the use of SPG for both classic and semantic communication from the maximization of the mutual information between received and target variables. Numerical results show that our approach achieves comparable performance to a model-aware approach based on the reparametrization trick, albeit with a decreased convergence rate.
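To illustrate why a stochastic policy gradient needs no differentiable channel model, the sketch below estimates the gradient of an expected reward with the score-function estimator, using only sampled reward values. It is a toy example under stated assumptions (a scalar Gaussian policy with mean `mu`, a hypothetical reward `-(x**2)`, and the function name `spg_gradient`), not the system from the paper.

```python
import random

def spg_gradient(mu, reward, num_samples=200_000, sigma=1.0, seed=0):
    """Score-function estimate of d/dmu E[reward(x)] for x ~ N(mu, sigma^2).

    The reward is only evaluated, never differentiated, which is why the
    approach also works when the environment is a black box.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(mu, sigma)
        # Gradient of log N(x; mu, sigma^2) with respect to mu.
        score = (x - mu) / sigma**2
        total += reward(x) * score
    return total / num_samples

# For reward(x) = -x^2 the analytic gradient is -2 * mu, so the estimate
# should lie close to -2 for mu = 1.
grad = spg_gradient(1.0, lambda x: -(x**2))
```

The reparametrization trick would instead differentiate through `reward`, which typically gives lower-variance gradients but requires a known, differentiable model.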
With the increasing complexity of modern communication systems, machine learning algorithms have become a focal point of research. However, performance demands have tightened in parallel with complexity. For some of the key applications targeted by future wireless systems, such as those in the medical field, strict and reliable performance guarantees are essential, but vanilla machine learning methods have been shown to struggle with these types of requirements. Therefore, the question is raised whether these methods can be extended to better deal with the demands imposed by such applications. In this paper, we look at a combinatorial resource allocation challenge with rare, significant events which must be handled properly. We propose to treat this as a multi-task learning problem, select two methods from this domain, Elastic Weight Consolidation and Gradient Episodic Memory, and integrate them into a vanilla actor-critic scheduler. We compare their performance in dealing with Black Swan Events against the state-of-the-art approach of augmenting the training data distribution and report that the multi-task approach proves highly effective.
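The core idea of Gradient Episodic Memory can be sketched for the simplest case of a single memory constraint, where the general quadratic program reduces to one projection. The function name `project_gradient` and the flat-list representation of gradients are assumptions for illustration; how the method is integrated into the actor-critic scheduler is not shown here.

```python
def project_gradient(grad, mem_grad):
    """Project a proposed update so it does not conflict with a memory gradient.

    grad:     gradient of the loss on the current task (flat list of floats).
    mem_grad: gradient of the loss on stored samples of a previous task,
              for example rare priority events.
    """
    dot = sum(g * m for g, m in zip(grad, mem_grad))
    if dot >= 0.0:
        # No conflict: the update does not increase the memory loss
        # to first order, so it is used unchanged.
        return list(grad)
    # Conflict: remove the component that points against mem_grad.
    mem_norm_sq = sum(m * m for m in mem_grad)
    scale = dot / mem_norm_sq
    return [g - scale * m for g, m in zip(grad, mem_grad)]

# Hypothetical conflicting gradients: the projected update is orthogonal
# to the memory gradient.
safe_grad = project_gradient([1.0, 0.0], [-1.0, 1.0])
```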
The quality of data-driven learning algorithms scales significantly with the quality of the data available. One of the most straightforward ways to generate good data is to sample or explore the data source intelligently. Smart sampling can reduce the cost of gaining samples, reduce computation cost in learning, and enable the learning algorithm to adapt to unforeseen events. In this paper, we train three Deep Q-Networks (DQN) with different exploration strategies to solve a problem of puncturing ongoing transmissions for URLLC messages. We demonstrate the efficiency of two adaptive exploration candidates, variance-based and Maximum Entropy-based exploration, compared to the standard, simple epsilon-greedy exploration approach.
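For reference, the epsilon-greedy baseline named above can be sketched in a few lines. The function name `epsilon_greedy` and the example Q-values are illustrative assumptions; the adaptive variance-based and Maximum Entropy-based strategies from the paper are not reproduced here.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated Q-values.

    With probability epsilon a uniformly random action is explored;
    otherwise the action with the highest Q-value is exploited.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Hypothetical Q-values for three actions; with epsilon = 0 the choice
# is purely greedy.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Because epsilon is fixed or follows a preset schedule, this strategy explores equally in well-known and unknown situations, which is the limitation that adaptive exploration aims to address.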
Questions remain on the robustness of data-driven learning methods when crossing the gap from simulation to reality. We utilize weight anchoring, a method known from continual learning, to cultivate and fixate desired behavior in Neural Networks. Weight anchoring may be used to find a solution to a learning problem that is near the solution of another learning problem. Thereby, learning can be carried out in optimal environments without neglecting or unlearning desired behavior. We demonstrate this approach using the example of learning mixed QoS-efficient discrete resource scheduling with infrequent priority messages. Results show that this method provides performance comparable to the state of the art of augmenting a simulation environment, alongside significantly increased robustness and steerability.
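One common way to keep a new solution near an earlier one is a quadratic penalty on the distance to stored anchor weights. The sketch below assumes this form, with optional per-weight importance factors as used in Elastic Weight Consolidation; the abstract does not state the exact penalty, and the names `anchored_loss`, `importance`, and `strength` are hypothetical.

```python
def anchored_loss(task_loss, weights, anchor_weights, importance, strength=1.0):
    """Add a quadratic anchoring penalty to the loss of the current task.

    weights:        current network parameters (flat list of floats).
    anchor_weights: parameters that encode the behavior to be preserved.
    importance:     per-parameter factors; larger values pin a weight
                    more firmly to its anchor (assumed, as in EWC).
    strength:       hypothetical trade-off between new task and anchor.
    """
    penalty = sum(
        f * (w - a) ** 2
        for w, a, f in zip(weights, anchor_weights, importance)
    )
    return task_loss + 0.5 * strength * penalty

# Hypothetical two-parameter example: only the first weight has moved
# away from its anchor, so only it contributes to the penalty.
loss = anchored_loss(1.0, [1.0, 2.0], [0.0, 2.0], [2.0, 5.0])
```

Raising `strength` steers the learner toward the anchored behavior, while lowering it gives more freedom to fit the current environment.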
Advances in mobile communication capabilities open the door for closer integration of pre-hospital and in-hospital care processes. For example, medical specialists can be enabled to guide on-site paramedics and can, in turn, be supplied with live vitals or visuals. Consolidating such performance-critical applications with the highly complex workings of mobile communications requires solutions both reliable and efficient, yet easy to integrate with existing systems. This paper explores the application of Deep Deterministic Policy Gradient (DDPG) methods for learning a communications resource scheduling algorithm with special regard to priority users. Unlike the popular Deep Q-Network methods, DDPG is able to produce continuous-valued output. With light post-processing, the resulting scheduler is able to achieve high performance on a flexible sum-utility goal.
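The abstract does not specify the post-processing step. One plausible form, shown here purely as an assumption, converts the continuous actor output into an integer number of resource blocks per user with largest-remainder rounding. The function name `allocate_blocks` and the example values are hypothetical.

```python
def allocate_blocks(actor_output, num_blocks):
    """Turn continuous per-user scores into integer resource block counts.

    actor_output: one continuous value per user, e.g. from a DDPG actor.
    num_blocks:   total number of discrete resource blocks to hand out.
    """
    # Negative scores are treated as a request for no resources.
    clipped = [max(v, 0.0) for v in actor_output]
    total = sum(clipped)
    if total == 0.0:
        # No preference expressed: split the blocks evenly.
        shares = [num_blocks / len(clipped)] * len(clipped)
    else:
        shares = [num_blocks * v / total for v in clipped]
    # Round down first, then give leftover blocks to the users whose
    # fractional share was cut the most.
    blocks = [int(s) for s in shares]
    leftover = num_blocks - sum(blocks)
    by_remainder = sorted(
        range(len(shares)), key=lambda u: shares[u] - blocks[u], reverse=True
    )
    for u in by_remainder[:leftover]:
        blocks[u] += 1
    return blocks

# Hypothetical actor output for three users and eight resource blocks.
allocation = allocate_blocks([2.0, 1.0, 1.0], 8)
```

Priority users could be handled before this step, for example by reserving blocks for them, but that choice is likewise outside what the abstract states.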
Direct Low Earth Orbit satellite-to-handheld links are expected to be part of a new era in satellite communications. Space-Division Multiple Access precoding is a technique that reduces interference among satellite beams, thereby increasing spectral efficiency by allowing cooperating satellites to reuse frequencies. Over the past decades, optimal precoding solutions with perfect channel state information have been proposed for several scenarios, whereas robust precoding with only imperfect channel state information has been mostly studied for simplified models. In particular, for Low Earth Orbit satellite applications such simplified models might not be accurate. In this paper, we use the function approximation capabilities of the Soft Actor-Critic deep Reinforcement Learning algorithm to learn robust precoding with no knowledge of the system imperfections.