With the rapid proliferation of smart mobile devices, federated learning (FL) has been widely considered for application in wireless networks for distributed model training. However, data heterogeneity, e.g., non-independently identically distributions and different sizes of training data among clients, poses major challenges to wireless FL. Limited communication resources complicate the implementation of fair scheduling which is required for training on heterogeneous data, and further deteriorate the overall performance. To address this issue, this paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation. Specifically, we first develop a closed-form expression for an upper bound on the FL loss function, with a particular emphasis on data heterogeneity described by a dataset size vector and a data divergence vector. Then we formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE). Next, via the Lyapunov drift technique, we transform the CRE optimization problem into a series of tractable problems. Extensive experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
In this paper, a communication-efficient federated learning (FL) framework is proposed for improving the convergence rate of FL under a limited uplink capacity. The central idea of the proposed framework is to transmit the values and positions of the top-$S$ entries of a local model update for uplink transmission. A lossless encoding technique is considered for transmitting the positions of these entries, while a linear transformation followed by the Lloyd-Max scalar quantization is considered for transmitting their values. For an accurate reconstruction of the top-$S$ values, a linear minimum mean squared error method is developed based on the Bussgang decomposition. Moreover, an error feedback strategy is introduced to compensate for both compression and reconstruction errors. The convergence rate of the proposed framework is analyzed for a non-convex loss function with consideration of the compression and reconstruction errors. From the analytical result, the key parameters of the proposed framework are optimized for maximizing the convergence rate for the given capacity. Simulation results on the MNIST and CIFAR-10 datasets demonstrate that the proposed framework outperforms state-of-the-art FL frameworks in terms of classification accuracy under the limited uplink capacity.
Extremely large-scale multiple-input-multiple-output (XL-MIMO), which offers vast spatial degrees of freedom, has emerged as a potentially pivotal enabling technology for the sixth generation (6G) of wireless mobile networks. With its growing significance, both opportunities and challenges are concurrently manifesting. This paper presents a comprehensive survey of research on XL-MIMO wireless systems. In particular, we introduce four XL-MIMO hardware architectures: uniform linear array (ULA)-based XL-MIMO, uniform planar array (UPA)-based XL-MIMO utilizing either patch antennas or point antennas, and continuous aperture (CAP)-based XL-MIMO. We comprehensively analyze and discuss their characteristics and interrelationships. Following this, we examine exact and approximate near-field channel models for XL-MIMO. Given the distinct electromagnetic properties of near-field communications, we present a range of channel models to demonstrate the benefits of XL-MIMO. We further motivate and discuss low-complexity signal processing schemes to promote the practical implementation of XL-MIMO. Furthermore, we explore the interplay between XL-MIMO and other emergent 6G technologies. Finally, we outline several compelling research directions for future XL-MIMO wireless communication systems.
The development of applications based on artificial intelligence and implemented over wireless networks is increasingly rapidly and is expected to grow dramatically in the future. The resulting demand for the aggregation of large amounts of data has caused serious communication bottlenecks in wireless networks and particularly at the network edge. Over-the-air federated learning (OTA-FL), leveraging the superposition feature of multi-access channels (MACs), enables users at the network edge to share spectrum resources and achieves efficient and low-latency global model aggregation. This paper provides a holistic review of progress in OTA-FL and points to potential future research directions. Specifically, we classify OTA-FL from the perspective of system settings, including single-antenna OTA-FL, multi-antenna OTA-FL, and OTA-FL with the aid of the emerging reconfigurable intelligent surface (RIS) technology, and the contributions of existing works in these areas are summarized. Moreover, we discuss the trust, security and privacy aspects of OTA-FL, and highlight concerns arising from security and privacy. Finally, challenges and potential research directions are discussed to promote the future development of OTA-FL in terms of improving system performance, reliability, and trustworthiness. Specifical challenges to be addressed include model distortion under channel fading, the ineffective OTA aggregation of local models trained on substantially unbalanced data, and the limited accessibility and verifiability of individual local models.
The dependence on training data of the Gibbs algorithm (GA) is analytically characterized. By adopting the expected empirical risk as the performance metric, the sensitivity of the GA is obtained in closed form. In this case, sensitivity is the performance difference with respect to an arbitrary alternative algorithm. This description enables the development of explicit expressions involving the training errors and test errors of GAs trained with different datasets. Using these tools, dataset aggregation is studied and different figures of merit to evaluate the generalization capabilities of GAs are introduced. For particular sizes of such datasets and parameters of the GAs, a connection between Jeffrey's divergence, training and test errors is established.
As a promising technique for high-mobility wireless communications, orthogonal time frequency space (OTFS) has been proved to enjoy excellent advantages with respect to traditional orthogonal frequency division multiplexing (OFDM). Although multiple studies have considered index modulation (IM) based OTFS (IM-OTFS) schemes to further improve system performance, a challenging and open problem is the development of effective IM schemes and efficient receivers for practical OTFS systems that must operate in the presence of channel delays and Doppler shifts. In this paper, we propose two novel block-wise IM schemes for OTFS systems, named delay-IM with OTFS (DeIM-OTFS) and Doppler-IM with OTFS (DoIM-OTFS), where a block of delay/Doppler resource bins are activated simultaneously. Based on a maximum likelihood (ML) detector, we analyze upper bounds on the average bit error rates for the proposed DeIM-OTFS and DoIM-OTFS schemes, and verify their performance advantages over the existing IM-OTFS systems. We also develop a multi-layer joint symbol and activation pattern detection (MLJSAPD) algorithm and a customized message passing detection (CMPD) algorithm for our proposed DeIMOTFS and DoIM-OTFS systems with low complexity. Simulation results demonstrate that our proposed MLJSAPD and CMPD algorithms can achieve desired performance with robustness to the imperfect channel state information (CSI).
Semantic-aware communication is a novel paradigm that draws inspiration from human communication focusing on the delivery of the meaning of messages. It has attracted significant interest recently due to its potential to improve the efficiency and reliability of communication and enhance users' QoE. Most existing works focus on transmitting and delivering the explicit semantic meaning that can be directly identified from the source signal. This paper investigates the implicit semantic-aware communication in which the hidden information that cannot be directly observed from the source signal must be recognized and interpreted by the intended users. To this end, a novel implicit semantic-aware communication (iSAC) architecture is proposed for representing, communicating, and interpreting the implicit semantic meaning between source and destination users. A projection-based semantic encoder is proposed to convert the high-dimensional graphical representation of explicit semantics into a low-dimensional semantic constellation space for efficient physical channel transmission. To enable the destination user to learn and imitate the implicit semantic reasoning process of source user, a generative adversarial imitation learning-based solution, called G-RML, is proposed. Different from existing communication solutions, the source user in G-RML does not focus only on sending as much of the useful messages as possible; but, instead, it tries to guide the destination user to learn a reasoning mechanism to map any observed explicit semantics to the corresponding implicit semantics that are most relevant to the semantic meaning. Compared to the existing solutions, our proposed G-RML requires much less communication and computational resources and scales well to the scenarios involving the communication of rich semantic meanings consisting of a large number of concepts and relations.
The beam squint effect, which manifests in different steering matrices in different sub-bands, has been widely considered a challenge in millimeter wave (mmWave) multiinput multi-output (MIMO) channel estimation. Existing methods either require specific forms of the precoding/combining matrix, which restrict their general practicality, or simply ignore the beam squint effect by only making use of a single sub-band for channel estimation. Recognizing that different steering matrices are coupled by the same set of unknown channel parameters, this paper proposes to exploit the common sparsity structure of the virtual channel model so that signals from different subbands can be jointly utilized to enhance the performance of channel estimation. A probabilistic model is built to induce the common sparsity in the spatial domain, and the first-order Taylor expansion is adopted to get rid of the grid mismatch in the dictionaries. To learn the model parameters, a variational expectation-maximization (EM) algorithm is derived, which automatically obtains the balance between the likelihood function and the common sparsity prior information, and is applicable to arbitrary forms of precoding/combining matrices. Simulation results show the superior estimation accuracy of the proposed algorithm over existing methods under different noise powers and system configurations.
We propose a novel privacy-preserving uplink over-the-air computation (AirComp) method, termed FLORAS, for single-input single-output (SISO) wireless federated learning (FL) systems. From the communication design perspective, FLORAS eliminates the requirement of channel state information at the transmitters (CSIT) by leveraging the properties of orthogonal sequences. From the privacy perspective, we prove that FLORAS can offer both item-level and client-level differential privacy (DP) guarantees. Moreover, by adjusting the system parameters, FLORAS can flexibly achieve different DP levels at no additional cost. A novel FL convergence bound is derived which, combined with the privacy guarantees, allows for a smooth tradeoff between convergence rate and differential privacy levels. Numerical results demonstrate the advantages of FLORAS compared with the baseline AirComp method, and validate that our analytical results can guide the design of privacy-preserving FL with different tradeoff requirements on the model convergence and privacy levels.
The effect of the relative entropy asymmetry is analyzed in the empirical risk minimization with relative entropy regularization (ERM-RER) problem. A novel regularization is introduced, coined Type-II regularization, that allows for solutions to the ERM-RER problem with a support that extends outside the support of the reference measure. The solution to the new ERM-RER Type-II problem is analytically characterized in terms of the Radon-Nikodym derivative of the reference measure with respect to the solution. The analysis of the solution unveils the following properties of relative entropy when it acts as a regularizer in the ERM-RER problem: i) relative entropy forces the support of the Type-II solution to collapse into the support of the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data; ii) Type-II regularization is equivalent to classical relative entropy regularization with an appropriate transformation of the empirical risk function. Closed-form expressions of the expected empirical risk as a function of the regularization parameters are provided.