Instance selection is a vital technique for energy big data analytics, where intelligent monitoring devices generate massive amounts of streaming data at high rates that are challenging to process. Instance selection aims to remove noisy and erroneous data that can compromise the performance of data-driven learners. In this context, this paper proposes a novel similarity-based instance selection (SIS) method for real-time phasor measurement unit data. In addition, we develop a variant of the Hoeffding tree learner, enhanced with SIS, for classifying disturbances and cyber-attacks. We validate the merits of the proposed learner by exploring its performance under four scenarios that affect either the system physics or the monitoring architecture. Our experiments use industrial control system cyber-attack datasets. Finally, we conduct an implementation analysis that shows the deployment feasibility and high-performance potential of the proposed learner as part of real-time monitoring applications.
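For readers unfamiliar with the Hoeffding tree mentioned above, the following minimal sketch illustrates the standard Hoeffding bound that such learners use to decide when enough streaming instances have been seen to split a node. It shows only the generic split test and not the proposed SIS-enhanced variant; the function names and the default values of `delta` and `value_range` are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range is within epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 delta: float = 1e-7, value_range: float = 1.0) -> bool:
    """Generic Hoeffding split test (illustrative defaults, not from the paper).

    Split when the observed gap between the two best split criteria
    exceeds the Hoeffding bound for the n instances seen at this leaf.
    """
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# With few instances the bound is wide, so the learner keeps waiting;
# with more instances the same gain gap becomes sufficient to split.
print(should_split(0.30, 0.25, n=200))
print(should_split(0.30, 0.25, n=5000))
```

Because the bound shrinks as more instances arrive, the quality of the instances retained by a selection step directly influences how quickly a confident split can be made.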
Emerging wide area monitoring systems (WAMS) have brought significant improvements in electric grids' situational awareness. However, these newly introduced systems can increase the risk of cyber-attacks, which may be disguised as normal physical disturbances. This paper addresses the event and intrusion detection problem by combining a stream data mining classifier (the Hoeffding adaptive tree) with semi-supervised learning techniques to accurately distinguish cyber-attacks from regular system perturbations. First, our proposed approach builds a dictionary by learning higher-level features from unlabeled data. Then, the labeled data are represented as sparse linear combinations of the learned dictionary atoms. We use those sparse codes to train the online classifier along with efficient change detectors. We conduct numerical experiments with industrial control system cyber-attack datasets. We consider five scenarios: short-circuit faults, line maintenance, remote tripping command injection, relay setting change, and false data injection. The data are generated based on a modified IEEE 9-bus system. Simulation results show that our proposed approach outperforms the state-of-the-art method.
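The sparse representation step described above can be illustrated with a minimal sketch. It assumes a dictionary of unit-norm atoms is already available and uses plain matching pursuit as a stand-in sparse coder; the abstract does not name the dictionary learning or sparse coding algorithm, so the function names, the toy dictionary, and the choice of matching pursuit are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal: Vector, atoms: List[Vector], n_nonzero: int) -> Vector:
    """Greedily approximate `signal` as a sparse combination of unit-norm atoms.

    Hypothetical stand-in for the sparse coding step; returns one
    coefficient per atom, most of which remain zero.
    """
    residual = list(signal)
    codes = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(correlations[i]))
        if correlations[k] == 0.0:
            break  # the residual is orthogonal to every atom
        codes[k] += correlations[k]
        # Remove the selected atom's contribution from the residual.
        residual = [r - correlations[k] * a for r, a in zip(residual, atoms[k])]
    return codes


# Toy dictionary (assumed, not learned): three orthonormal atoms.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
codes = matching_pursuit([3.0, 0.0, -2.0], atoms, n_nonzero=2)
print(codes)
```

In a pipeline of the kind described, the resulting coefficient vector, rather than the raw measurement, would be passed to the online classifier as the feature representation of each labeled instance.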