Viewport prediction is a crucial task for adaptive 360-degree video streaming, as bitrate control algorithms usually require knowledge of which portions of each frame the user is viewing. Various methods have been studied and adopted for viewport prediction, from less accurate statistical tools to highly calibrated deep neural networks. Conventionally, it is difficult to implement sophisticated deep learning methods on mobile devices, which have limited computation capability. In this work, we propose an advanced learning-based viewport prediction approach and carefully design it to introduce minimal transmission and computation overhead for mobile terminals. We also propose a model-agnostic meta-learning (MAML) based saliency prediction network trainer, which provides a few-sample fast training solution that obtains the prediction model by utilizing information from past models. We further discuss how to integrate this mobile-friendly viewport prediction (MFVP) approach into a typical 360-degree live video streaming system by formulating and solving the bitrate adaptation problem. Extensive experimental results show that our prediction approach works in real time for live video streaming and achieves higher accuracy than other existing prediction methods on mobile devices, which, together with our bitrate adaptation algorithm, significantly improves the streaming QoE in several respects. We observe that the accuracy of MFVP is 8.1$\%$ to 28.7$\%$ higher than that of other algorithms, and that MFVP achieves a 3.73$\%$ to 14.96$\%$ higher average quality level and 49.6$\%$ to 74.97$\%$ less quality level change than other algorithms.
Recently, federated learning (FL) has gained momentum because of its capability to preserve data privacy. To conduct model training by FL, multiple clients exchange model updates with a parameter server via the Internet. To accelerate communication, recent work has explored deploying a programmable switch (PS) in lieu of the parameter server to coordinate clients. The challenge in deploying the PS in FL lies in its scarce memory space, which prohibits running memory-consuming aggregation algorithms on the PS. To overcome this challenge, we propose the Federated Learning in-network Aggregation with Compression (FediAC) algorithm, consisting of two phases: client voting and model aggregating. In the former phase, clients report the indices of their significant model updates to the PS, which uses them to estimate the globally significant model updates. In the latter phase, clients upload the globally significant model updates to the PS for aggregation. FediAC consumes much less memory space and communication traffic than existing works because the first phase guarantees consensus compression across clients. The PS can therefore easily align model update indices and swiftly complete aggregation in the second phase. Finally, we conduct extensive experiments using public datasets to demonstrate that FediAC remarkably surpasses the state-of-the-art baselines in terms of model accuracy and communication traffic.
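The two phases described above can be illustrated with a minimal sketch. It is not the FediAC implementation: the function names, the top-k voting rule, and the `min_votes` threshold are assumptions introduced here purely to show how a voting phase lets every client upload values at the same indices, so that the aggregator only has to add position-aligned numbers.

```python
from collections import Counter


def top_k_indices(update, k):
    """Indices of the k largest-magnitude entries of one client's update."""
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return order[:k]


def vote(updates, k, min_votes):
    """Phase 1 (assumed rule): keep indices voted for by at least min_votes clients."""
    counts = Counter(i for u in updates for i in top_k_indices(u, k))
    return sorted(i for i, c in counts.items() if c >= min_votes)


def aggregate(updates, shared_indices):
    """Phase 2: all clients send values at the same indices; average them per index."""
    n = len(updates)
    return {i: sum(u[i] for u in updates) / n for i in shared_indices}


# Three hypothetical clients, each holding a 4-dimensional model update.
updates = [
    [0.9, 0.0, -0.8, 0.1],
    [0.7, 0.2, -0.1, 0.0],
    [0.1, 0.0, -0.6, 0.5],
]
shared = vote(updates, k=2, min_votes=2)  # indices 0 and 2 each receive two votes
result = aggregate(updates, shared)
```

Because the index set is agreed on before any values are sent, the aggregator needs memory only for `len(shared)` accumulators rather than for the full model dimension.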
In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds, without touching private data owned by individual clients. FL is appealing for preserving data privacy, yet the communication between the PS and scattered clients can be a severe bottleneck. Model compression algorithms, such as quantization and sparsification, have been suggested, but they generally assume a fixed code length, which does not reflect the heterogeneity and variability of model updates. In this paper, through both analysis and experiments, we show strong evidence that variable-length codes are beneficial for compression in FL. We accordingly present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates. We develop an optimal tuning strategy that minimizes the loss function (equivalent to maximizing the model utility) subject to the communication budget. We further demonstrate that Fed-CVLC is a general compression design that bridges quantization and sparsification with greater flexibility. Extensive experiments have been conducted with public datasets to demonstrate that Fed-CVLC remarkably outperforms state-of-the-art baselines, improving model utility by 1.50%-5.44%, or shrinking communication traffic by 16.67%-41.61%.
Distributed machine learning (DML) in mobile environments faces significant communication bottlenecks. Gradient compression has emerged as an effective solution to this issue, offering substantial benefits in environments with limited bandwidth and metered data. Yet existing compressors suffer a severe performance drop in non-IID environments because they take a one-size-fits-all approach that does not account for the varying data volumes across workers. Assigning different compression ratios to workers with distinct data distributions and volumes is thus a promising solution. This study introduces an analysis of distributed SGD with non-uniform compression, which reveals that the convergence rate (indicative of the iterations needed to achieve a certain accuracy) is influenced by the compression ratios applied to workers with differing data volumes. Accordingly, we frame relative compression ratio assignment as an $n$-variable chi-square nonlinear optimization problem, constrained by a fixed and limited communication budget. We propose DAGC-R, which assigns more conservative compression to workers handling larger data volumes. Recognizing the computational limitations of mobile devices, we also propose DAGC-A, which is computationally less demanding and enhances the robustness of the absolute gradient compressor in non-IID scenarios. Our experiments confirm that both DAGC-A and DAGC-R achieve better performance when dealing with highly imbalanced data volume distributions and restricted communication.
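The idea of non-uniform compression under a fixed budget can be made concrete with a small sketch. The proportional allocation rule below is an assumption chosen for simplicity; it stands in for, and is not, the optimization that DAGC-R solves. The names `allocate_budget` and `top_k` are hypothetical, and top-k sparsification is used only as a familiar example compressor.

```python
def allocate_budget(volumes, total_kept):
    """Split a fixed budget of transmitted entries in proportion to data volume.

    Assumed heuristic: a worker with more data keeps more gradient entries,
    i.e. it is compressed more conservatively. The floor of 1 entry per worker
    can push the sum slightly above total_kept when some volumes are tiny.
    """
    total = sum(volumes)
    return [max(1, total_kept * v // total) for v in volumes]


def top_k(gradient, k):
    """Keep the k largest-magnitude entries of a gradient and zero out the rest."""
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]


# Three hypothetical workers with highly imbalanced data volumes.
volumes = [800, 150, 50]
budget = allocate_budget(volumes, total_kept=20)  # [16, 3, 1]
sparse = top_k([0.1, -0.9, 0.5, 0.0], k=2)        # [0.0, -0.9, 0.5, 0.0]
```

A uniform scheme would give each worker the same share of the 20 entries; here the worker holding 80% of the data keeps 16 of them.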
Recently, federated learning (FL) has received intensive research attention because of its ability to preserve data privacy while scattered clients collaboratively train machine learning models. Commonly, a parameter server (PS) is deployed to aggregate model parameters contributed by different clients. Decentralized federated learning (DFL) extends FL by allowing clients to aggregate model parameters with their neighbours directly. DFL is particularly feasible for vehicular networks, as vehicles communicate with each other in a vehicle-to-vehicle (V2V) manner. However, due to the restrictions of vehicle routes and communication distances, it is hard for individual vehicles to sufficiently exchange models with others. The data sources contributing to the models on individual vehicles may not be diversified enough, resulting in poor model accuracy. To address this problem, we propose the DFL-DDS (DFL with diversified Data Sources) algorithm to diversify data sources in DFL. Specifically, each vehicle maintains a state vector to record the contribution weight of each data source to its model. The Kullback-Leibler (KL) divergence is adopted to measure the diversity of a state vector. To boost the convergence of DFL, a vehicle tunes the aggregation weight of each data source by minimizing the KL divergence of its state vector, and the effectiveness of this strategy in diversifying data sources can be proved theoretically. Finally, the superiority of DFL-DDS is evaluated by extensive experiments (with the MNIST and CIFAR-10 datasets), which demonstrate that DFL-DDS can accelerate the convergence of DFL and improve the model accuracy significantly compared with state-of-the-art baselines.
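To make the state-vector idea concrete, the sketch below measures diversity as the KL divergence between a normalized state vector and the uniform distribution, then picks the aggregation weight that minimizes it when merging with one neighbour. Several things here are assumptions rather than details from DFL-DDS: the uniform reference distribution, the convex mixing of two state vectors, the grid search, and the names `kl_to_uniform`, `mix`, and `best_alpha`.

```python
import math


def kl_to_uniform(state):
    """KL divergence from the normalized state vector to the uniform distribution.

    Zero means every data source contributes equally; larger values mean the
    model is dominated by fewer sources. Assumes sum(state) > 0.
    """
    total = sum(state)
    n = len(state)
    kl = 0.0
    for w in state:
        p = w / total
        if p > 0:  # terms with p == 0 contribute nothing
            kl += p * math.log(p * n)
    return kl


def mix(own, neighbour, alpha):
    """State vector after keeping weight alpha on the own model (assumed update rule)."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(own, neighbour)]


def best_alpha(own, neighbour, steps=300):
    """Grid search for the aggregation weight giving the most diverse state vector."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: kl_to_uniform(mix(own, neighbour, a)))


# A vehicle trained only on source 0 meets a neighbour whose model
# mixes sources 1 and 2 equally (hypothetical values).
own = [1.0, 0.0, 0.0]
neighbour = [0.0, 0.5, 0.5]
alpha = best_alpha(own, neighbour)
```

In this toy case the minimizer keeps about one third of the weight on the vehicle's own model, which makes the three data sources contribute equally after aggregation.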
Recently, blockchain-based federated learning (BFL) has attracted intensive research attention because its training process is auditable and its serverless architecture avoids the single point of failure posed by the parameter server in vanilla federated learning (VFL). Nevertheless, BFL tremendously escalates the communication traffic volume because all local model updates (i.e., changes of model parameters) obtained by BFL clients are transmitted to all miners for verification and to all clients for aggregation. In contrast, the parameter server and clients in VFL only retain aggregated model updates. Consequently, the huge communication traffic in BFL inevitably impairs training efficiency and hinders the deployment of BFL in practice. To improve the practicality of BFL, we are among the first to propose a fast blockchain-based communication-efficient federated learning framework, called BCFL, which compresses communications in BFL. We also derive the convergence rate of BCFL with non-convex loss. To maximize the final model accuracy, we further formulate the problem of minimizing the training loss given by the convergence rate, subject to a limited training time, with respect to the compression rate and the block generation rate; this is a bi-convex optimization problem that can be solved efficiently. Finally, to demonstrate the efficiency of BCFL, we carry out extensive experiments with the standard CIFAR-10 and FEMNIST datasets. Our experimental results not only verify the correctness of our analysis, but also show that BCFL can remarkably reduce the communication traffic by 95-98% or shorten the training time by 90-95% compared with BFL.
Convolutional neural networks (CNNs) and Transformers have achieved great success in multimedia applications. However, little effort has been made to effectively and efficiently harmonize these two architectures for image deraining. This paper aims to unify the two architectures and take advantage of their respective learning merits for image deraining. In particular, the local connectivity and translation equivariance of CNNs and the global aggregation ability of self-attention (SA) in Transformers are fully exploited for specific local context and global structure representations. Based on the observation that the rain distribution reveals the location and degree of degradation, we introduce a degradation prior to help background recovery and accordingly present the association refinement deraining scheme. A novel multi-input attention module (MAM) is proposed to associate rain perturbation removal and background recovery. Moreover, we equip our model with effective depth-wise separable convolutions to learn the specific feature representations and trade off computational complexity. Extensive experiments show that our proposed method (dubbed ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average, while requiring only 11.7\% and 42.1\% of its computational cost and parameters, respectively. The source code is available at https://github.com/kuijiang94/Magic-ELF.
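The saving that motivates depth-wise separable convolutions can be checked with simple arithmetic. The sketch below counts weights (ignoring biases) for a standard convolution and for its depth-wise separable counterpart; the kernel size and channel counts are illustrative values and are not taken from ELF.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 73728
separable = depthwise_separable_params(3, 64, 128)   # 8768
ratio = separable / standard                         # roughly 0.12
```

For this layer the separable form needs about 12% of the weights of the standard convolution, which is the kind of complexity trade-off the abstract refers to.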
Federated Learning (FL) incurs high communication overhead, which can be greatly alleviated by compressing model updates. Yet the tradeoff between compression and model accuracy in the networked environment remains unclear and, for simplicity, most implementations adopt only a fixed compression rate. In this paper, we systematically examine this tradeoff for the first time, identifying the influence of the compression error on the final model accuracy with respect to the learning rate. Specifically, we factor the compression error of each global iteration into the convergence rate analysis under both strongly convex and non-convex loss functions. We then present an adaptation framework that maximizes the final model accuracy by strategically adjusting the compression rate in each iteration. We discuss the key implementation issues of our framework in practical networks with representative compression algorithms. Experiments over the popular MNIST and CIFAR-10 datasets confirm that our solution effectively reduces network traffic while maintaining high model accuracy in FL.
Federated Learning (FL) is an emerging decentralized learning framework through which multiple clients can collaboratively train a learning model. However, a major obstacle that impedes the wide deployment of FL lies in its massive communication traffic. When training high-dimensional machine learning models (such as CNN models), exchanging model updates via the Internet between clients and the parameter server (PS) can incur heavy communication traffic, so network resources can be easily exhausted. Compressing model updates is an effective way to reduce the traffic volume. However, a flexible unbiased compression algorithm applicable to both uplink and downlink compression in FL is still absent from existing works. In this work, we devise the Model Update Compression by Soft Clustering (MUCSC) algorithm to compress model updates transmitted between clients and the PS. In MUCSC, it is only necessary to transmit the cluster centroids and the cluster ID of each model update. Moreover, we prove that: 1) the compressed model updates are unbiased estimates of their original values, so that the convergence rate when transmitting compressed model updates is unchanged; 2) MUCSC guarantees that the influence of the compression error on the model accuracy is minimized. We further propose the boosted MUCSC (B-MUCSC) algorithm, a biased compression algorithm that achieves an extremely high compression rate by grouping insignificant model updates into a super cluster. B-MUCSC is suitable for scenarios with very scarce network resources. Finally, we conduct extensive experiments with the CIFAR-10 and FEMNIST datasets to demonstrate that our algorithms not only substantially reduce the volume of communication traffic in FL, but also improve the training efficiency in practical networks.
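One way to see how transmitting only centroids and cluster IDs can remain unbiased is stochastic assignment between the two centroids that bracket a value. The sketch below uses that standard technique as a stand-in; it is an assumption about how a soft assignment might be realized, not the MUCSC algorithm itself, and the names `assign_cluster`, `compress`, and `decompress` are hypothetical. It assumes sorted, distinct centroids that span the range of the update.

```python
import bisect
import random


def assign_cluster(x, centroids, rng):
    """Pick one of the two centroids bracketing x so that the expected decoded value is x.

    Values outside the centroid range are clipped to the nearest end,
    which would introduce bias; the centroids are assumed to span the data.
    """
    hi = bisect.bisect_left(centroids, x)
    if hi == 0:
        return 0
    if hi == len(centroids):
        return len(centroids) - 1
    lo = hi - 1
    # Probability of the upper centroid grows linearly as x approaches it.
    p_hi = (x - centroids[lo]) / (centroids[hi] - centroids[lo])
    return hi if rng.random() < p_hi else lo


def compress(update, centroids, rng):
    """Replace every model update entry by a cluster ID."""
    return [assign_cluster(x, centroids, rng) for x in update]


def decompress(ids, centroids):
    """Recover approximate values from cluster IDs and the shared centroids."""
    return [centroids[i] for i in ids]


rng = random.Random(0)
centroids = [-1.0, 0.0, 1.0]  # hypothetical centroids
update = [0.3, -0.5, 1.0]     # hypothetical model update
ids = compress(update, centroids, rng)
restored = decompress(ids, centroids)
```

With three centroids each entry costs two bits for its ID instead of a full floating-point value, and averaging many decoded copies of 0.3 converges to 0.3 because the value is mapped to 1.0 with probability 0.3 and to 0.0 otherwise.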