Abstract: Existing methods for identifying outliers in wind speed-power datasets are often challenged by the complicated and irregular distributions of outliers, especially outliers that are densely stacked yet lie close to normal data. This can degrade their reliability and robustness in practice. To address this shortcoming, this paper develops a three-stage composite outlier identification method that systematically integrates three complementary techniques: physical rule-based preprocessing, regression learning-enabled detection, and mathematical morphology-based refinement. First, the raw wind speed-power data are preprocessed with a set of simple yet efficient physical rules to filter out outliers that obviously violate the physical operating laws of practical wind turbines. Second, a robust wind speed-power regression model is built upon the random sample consensus (RANSAC) algorithm; it reliably detects most outliers using an adaptive threshold set automatically by the interquartile range method. Third, by representing the wind speed-power data distribution as a two-dimensional image, mathematical morphology operations are applied to refine outlier identification from a data distribution perspective. This stage can identify outliers that the first two stages do not effectively detect, including densely stacked ones near normal data points. By integrating these three techniques, the overall method identifies various types of outliers in a reliable and adaptive manner. Numerical tests on wind power datasets acquired from distinct real-world wind turbines and from simulation environments demonstrate the superiority of the proposed method as well as its potential for enhancing wind power prediction.
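To make the three stages concrete, the following is a minimal Python sketch of one way they could be realized. It is not the paper's implementation: the abstract does not specify the physical rules, the regression model, or the morphological operation, so all of those are assumptions made here for illustration. In particular, the function names, the cut-in/cut-out/rated values, the one-parameter cubic power curve `P = a * v**3` clipped at rated power, the fixed RANSAC consensus tolerance, the 1.5 x IQR fences, and the 3x3 binary opening are all hypothetical choices.

```python
import random
from statistics import quantiles


def physical_rule_outliers(points, cut_in=3.0, cut_out=25.0, rated=2000.0, tol=0.02):
    """Stage 1 (assumed rules): indices of (speed, power) points that break basic physics."""
    flagged = []
    for i, (v, p) in enumerate(points):
        # Negative readings, or power clearly above the rated capacity.
        impossible = v < 0 or p < 0 or p > rated * (1 + tol)
        # Noticeable power while the wind is outside the operating speed range.
        generating_while_idle = (v < cut_in or v > cut_out) and p > tol * rated
        if impossible or generating_while_idle:
            flagged.append(i)
    return flagged


def ransac_iqr_outliers(points, rated=2000.0, n_trials=200, inlier_tol=100.0, seed=0):
    """Stage 2 (assumed model): RANSAC fit of P = min(rated, a * v**3), then IQR fences."""
    rng = random.Random(seed)

    def predict(a, v):
        return min(rated, a * v ** 3)

    best_inliers = []
    for _ in range(n_trials):
        v0, p0 = rng.choice(points)  # minimal sample: one point determines a
        if v0 <= 0:
            continue
        a = p0 / v0 ** 3
        inliers = [(v, p) for v, p in points if abs(p - predict(a, v)) <= inlier_tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers

    # Refit a by least squares on the consensus set, using only points below rated power.
    below = [(v, p) for v, p in best_inliers if p < 0.95 * rated]
    den = sum(v ** 6 for v, _ in below)
    if den == 0:
        raise ValueError("no usable consensus set below rated power")
    a_fit = sum(p * v ** 3 for v, p in below) / den

    # Adaptive threshold: flag residuals outside the 1.5 x IQR fences.
    residuals = [p - predict(a_fit, v) for v, p in points]
    q1, _, q3 = quantiles(residuals, n=4)
    spread = 1.5 * (q3 - q1)
    return [i for i, r in enumerate(residuals) if r < q1 - spread or r > q3 + spread]


def morphology_outliers(points, v_bin=0.5, p_bin=50.0,
                        element=tuple((dv, dp) for dv in (-1, 0, 1) for dp in (-1, 0, 1))):
    """Stage 3 (assumed operation): rasterize the points, apply a binary opening,
    and flag points whose grid cell does not survive it."""
    cell_of = [(int(v // v_bin), int(p // p_bin)) for v, p in points]
    image = set(cell_of)  # binary image stored as the set of occupied cells
    # Erosion: keep a cell only if the whole structuring element fits inside the image.
    eroded = {c for c in image
              if all((c[0] + dv, c[1] + dp) in image for dv, dp in element)}
    # Dilation of the eroded image (the element is symmetric, so no reflection is needed).
    opened = {(c[0] + dv, c[1] + dp) for c in eroded for dv, dp in element}
    return [i for i, c in enumerate(cell_of) if c not in opened]
```

In a pipeline following the abstract, each function would be applied in turn to the points that survived the previous stage. The opening in the last stage removes any occupied region thinner than the structuring element, so the bin sizes and the element would need tuning so that the band of normal data stays thicker than the element while small or thin clusters do not.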