In machine learning (ML) systems with client-host connections, federated learning (FL) can strengthen privacy protection as a secure distributed ML methodology. FL integrates cloud infrastructure to transfer ML models onto edge servers using blockchain technology. Through this mechanism, it meets the processing and data storage requirements of both centralized and decentralized systems, with an emphasis on scalability, privacy, and cost-effective communication. In current FL implementations, data owners train their models locally and then upload the results, in the form of weights, gradients, and parameters, to the cloud for global model aggregation. This removes the need for Internet of Things (IoT) clients and participants to send raw and potentially confidential data directly to a cloud center, which both reduces communication costs and improves the protection of private data. This survey analyzes and compares recent FL applications to assess their efficiency, accuracy, and privacy protection. However, given the complex and evolving nature of FL, additional research is needed to close the remaining knowledge gaps and address the upcoming challenges in this field. In this study, we categorize recent literature into the following clusters: privacy protection, resource allocation, case study analysis, and applications. Furthermore, at the end of each section, we tabulate the open areas and future directions presented in the referenced literature, giving researchers and scholars an insightful view of the evolution of the field.
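The local-training and cloud-aggregation loop described above can be illustrated with a minimal sketch. It assumes a sample-count-weighted average of client weight vectors (in the style of federated averaging), which is only one possible aggregation rule and is not claimed to be the one used by any surveyed work; the function name `federated_average` and the toy client updates are hypothetical.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Aggregate client weight vectors into one global vector.

    Each update is (local_weights, num_local_samples). Clients with more
    local samples contribute proportionally more to the global model.
    This is an illustrative assumption, not a specific system's rule.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(updates[0][0])
    global_weights = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must report the same number of weights")
        for i, w in enumerate(weights):
            # Only model weights are combined; raw client data never leaves the client.
            global_weights[i] += w * n / total_samples
    return global_weights


# Two hypothetical clients upload locally trained weights and sample counts.
client_updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
global_weights = federated_average(client_updates)
print(global_weights)
```

Only the weight vectors and sample counts cross the network in this sketch, which is the property that lets clients keep raw data local.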
In this paper, we present a novel method for detecting fake and Large Language Model (LLM)-generated profiles in the LinkedIn Online Social Network immediately upon registration and before connections are established. Early fake profile identification is crucial to maintaining the platform's integrity, since it prevents imposters from acquiring the private and sensitive information of legitimate users and from gaining an opportunity to increase their credibility for future phishing and scamming activities. This work uses textual information provided in LinkedIn profiles and introduces the Section and Subsection Tag Embedding (SSTE) method to enhance the discriminative characteristics of these data for distinguishing between legitimate profiles and those created by imposters manually or by using an LLM. Additionally, the lack of a large publicly available LinkedIn dataset motivated us to collect 3600 LinkedIn profiles for our research. We will release our dataset publicly for research purposes. This is, to the best of our knowledge, the first large publicly available LinkedIn dataset for fake LinkedIn account detection. Within our paradigm, we assess static and contextualized word embeddings, including GloVe, Flair, BERT, and RoBERTa. We show that the suggested method can distinguish between legitimate and fake profiles with an accuracy of about 95% across all word embeddings. In addition, we show that SSTE achieves promising accuracy in identifying LLM-generated profiles even though no LLM-generated profiles were used during the training phase, and that it can reach an accuracy of approximately 90% when only 20 LLM-generated profiles are added to the training set. This is a significant finding, since the proliferation of LLMs in the near future will make it extremely challenging to design a single system that can identify profiles created with various LLMs.
Detecting the salient parts of motor-imagery electroencephalogram (MI-EEG) signals can enhance the performance of the brain-computer interface (BCI) system and reduce the computational burden required for processing lengthy MI-EEG signals. In this paper, we propose an unsupervised method based on the self-attention mechanism to detect the salient intervals of MI-EEG signals automatically. Our suggested method can be used as a preprocessing step within any BCI algorithm to enhance its performance. The effectiveness of the suggested method is evaluated on the most widely used BCI algorithm, the common spatial pattern (CSP) algorithm, using dataset 2a from BCI competition IV. The results indicate that the proposed method can effectively prune MI-EEG signals and significantly enhance the performance of the CSP algorithm in terms of classification accuracy.
A calibration procedure is required in motor imagery-based brain-computer interface (MI-BCI) systems to tune the system for new users. This procedure is time-consuming and prevents naïve users from using the system immediately. Developing a subject-independent MI-BCI system that reduces the calibration phase is still challenging due to the subject-dependent characteristics of MI signals. Many algorithms based on machine learning and deep learning have been developed to extract high-level features from MI signals and improve the subject-to-subject generalization of a BCI system. However, these methods are based on supervised learning and extract features useful for discriminating between various MI signals. Hence, these approaches cannot find the common underlying patterns in the MI signals, and their generalization level is limited. This paper proposes a subject-independent MI-BCI based on a supervised autoencoder (SAE), referred to as SISAE, to circumvent the calibration phase. The suggested framework is validated on dataset 2a from BCI competition IV. The simulation results show that our SISAE model outperforms the conventional and widely used BCI algorithms, common spatial pattern (CSP) and filter bank common spatial pattern (FBCSP), in terms of the mean Kappa value, in eight out of nine subjects.
In a self-paced motor-imagery brain-computer interface (MI-BCI), the onsets of the MI commands present in a continuous electroencephalogram (EEG) signal are unknown. To detect these onsets, most self-paced approaches apply a window function to the continuous EEG signal and split it into long segments for further analysis. As a result, the system has a high latency. To reduce the system latency, we propose an algorithm based on the time series prediction concept that uses the previously received time samples to predict the upcoming time samples. Our predictor is an encoder-decoder (ED) network built with long short-term memory (LSTM) units. The onsets of the MI commands are then detected with a short delay by comparing the incoming signal with the predicted signal. The proposed method is validated on dataset IVc from BCI competition III. The simulation results show that the proposed algorithm improves the average F1-score achieved by the winner of the competition by 26.7% for latencies shorter than one second.
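The idea of detecting an onset by comparing incoming samples with predicted samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the LSTM encoder-decoder is replaced by a trivial moving-average predictor, the signal is a single synthetic channel, and the names `mean_predictor` and `detect_onset`, the window length, and the threshold are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def mean_predictor(history: Sequence[float]) -> float:
    """Stand-in predictor: mean of the recent samples.

    Assumption: a real system would use a trained sequence model here
    (the abstract describes an LSTM encoder-decoder).
    """
    return sum(history) / len(history)


def detect_onset(
    signal: Sequence[float],
    predictor: Callable[[Sequence[float]], float],
    window: int = 4,
    threshold: float = 1.0,
) -> Optional[int]:
    """Return the first sample index whose prediction error exceeds the threshold.

    The predictor only sees the `window` samples before index t, so a
    decision is available as soon as sample t arrives.
    """
    for t in range(window, len(signal)):
        predicted = predictor(signal[t - window:t])
        if abs(signal[t] - predicted) > threshold:
            return t
    return None


# Synthetic single-channel signal: low-amplitude activity, then a jump at index 6.
signal = [0.0, 0.1, -0.1, 0.0, 0.1, 0.0, 2.0, 2.1, 1.9]
onset = detect_onset(signal, mean_predictor)
print(onset)
```

Because each decision depends only on past samples plus the current one, the detection delay in this sketch is bounded by one sample rather than by the length of a long analysis segment.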