Abstract:In this paper, deceptive signal-assisted private split learning is investigated. In our model, several edge devices jointly perform collaborative training, and some eavesdroppers aim to collect the model and data information from devices. To prevent the eavesdroppers from collecting model and data information, a subset of devices can transmit deceptive signals. Therefore, it is necessary to determine the subset of devices used for deceptive signal transmission, the subset of model training devices, and the models assigned to each model training device. This problem is formulated as an optimization problem whose goal is to minimize the information leaked to eavesdroppers while meeting the model training energy consumption and delay constraints. To solve this problem, we propose a soft actor-critic deep reinforcement learning framework with intrinsic curiosity module and cross-attention (ICM-CA) that enables a centralized agent to determine the model training devices, the deceptive signal transmission devices, the transmit power, and sub-models assigned to each model training device without knowing the position and monitoring probability of eavesdroppers. The proposed method uses an ICM module to encourage the server to explore novel actions and states and a CA module to determine the importance of each historical state-action pair thus improving training efficiency. Simulation results demonstrate that the proposed method improves the convergence rate by up to 3x and reduces the information leaked to eavesdroppers by up to 13% compared to the traditional SAC algorithm.
Abstract:In this paper, a secure and communication-efficient clustered federated learning (CFL) design is proposed. In our model, several base stations (BSs) with heterogeneous task-handling capabilities and multiple users with non-independent and identically distributed (non-IID) data jointly perform CFL training incorporating differential privacy (DP) techniques. Since each BS can process only a subset of the learning tasks and has limited wireless resource blocks (RBs) to allocate to users for federated learning (FL) model parameter transmission, it is necessary to jointly optimize RB allocation and user scheduling for CFL performance optimization. Meanwhile, our considered CFL method requires devices to use their limited data and FL model information to determine their task identities, which may introduce additional communication overhead. We formulate an optimization problem whose goal is to minimize the training loss of all learning tasks while considering device clustering, RB allocation, DP noise, and FL model transmission delay. To solve the problem, we propose a novel dynamic penalty function assisted value decomposed multi-agent reinforcement learning (DPVD-MARL) algorithm that enables distributed BSs to independently determine their connected users, RBs, and DP noise of the connected users but jointly minimize the training loss of all learning tasks across all BSs. Different from the existing MARL methods that assign a large penalty for invalid actions, we propose a novel penalty assignment scheme that assigns penalty depending on the number of devices that cannot meet communication constraints (e.g., delay), which can guide the MARL scheme to quickly find valid actions, thus improving the convergence speed. Simulation results show that the DPVD-MARL can improve the convergence rate by up to 20% and the ultimate accumulated rewards by 15% compared to independent Q-learning.