
Maciej K. Hryniewicki

XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning

Dec 01, 2019

Yue Zhao, Maciej K. Hryniewicki


Abstract: A new semi-supervised ensemble algorithm called XGBOD (Extreme Gradient Boosting Outlier Detection) is proposed, described and demonstrated for the enhanced detection of outliers from normal observations in various practical datasets. The proposed framework combines the strengths of both supervised and unsupervised machine learning methods by creating a hybrid approach that exploits each of their individual performance capabilities in outlier detection. XGBOD uses multiple unsupervised outlier mining algorithms to extract useful representations from the underlying data that augment the predictive capabilities of an embedded supervised classifier on an improved feature space. The novel approach is shown to provide superior performance in comparison to competing individual detectors, the full ensemble and two existing representation learning based algorithms across seven outlier datasets.

* Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN)
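
The feature augmentation step described in the XGBOD abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the k-th-nearest-neighbor distance detector, the helper names `knn_distance_scores` and `augment_features`, and the toy data are all chosen for the example. The supervised classifier is omitted; it would be trained on the augmented rows together with the available labels.

```python
import math

def knn_distance_scores(points, k):
    """Unsupervised outlier score: distance to the k-th nearest other point."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def augment_features(points, score_functions):
    """Append one column of outlier scores per detector to the original features."""
    columns = [fn(points) for fn in score_functions]
    return [list(p) + [col[i] for col in columns] for i, p in enumerate(points)]

# Toy data (hypothetical): four clustered points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]

# Two assumed base detectors that differ only in the neighborhood size k.
detectors = [
    lambda pts: knn_distance_scores(pts, 1),
    lambda pts: knn_distance_scores(pts, 3),
]

# Each row now holds the 2 original features plus 2 outlier-score features.
augmented = augment_features(points, detectors)
```

The distant point receives the largest value in both added columns, which is the kind of signal a downstream classifier could exploit on the enlarged feature space.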


DCSO: Dynamic Combination of Detector Scores for Outlier Ensembles

Nov 23, 2019

Yue Zhao, Maciej K. Hryniewicki


Abstract: Selecting and combining the outlier scores of different base detectors used within outlier ensembles can be quite challenging in the absence of ground truth. In this paper, an unsupervised outlier detector combination framework called DCSO is proposed, demonstrated and assessed for the dynamic selection of the most competent base detectors, with an emphasis on data locality. The proposed DCSO framework first defines the local region of a test instance by its k nearest neighbors and then identifies the top-performing base detectors within the local region. Experimental results on ten benchmark datasets demonstrate that DCSO provides consistent performance improvement over existing static combination approaches in mining outlying objects. To facilitate interpretability and reliability of the proposed method, DCSO is analyzed using both theoretical frameworks and visualization techniques, and presented alongside empirical parameter setting instructions that can be used to improve the overall performance.

* ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Outlier Detection De-constructed Workshop, 2018
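
The two steps named in the DCSO abstract (define a local region by k nearest neighbors, then pick the top-performing base detector inside it) can be sketched as below. The abstract does not say how "top-performing" is measured without ground truth, so this sketch assumes a pseudo target equal to the mean score across detectors and uses Pearson correlation with that target as the competence measure. The function names and toy scores are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_local_detector(train, train_scores, test_point, k):
    """Return the index of the detector judged most competent near test_point.

    train_scores[d][i] is the score detector d assigned to training point i.
    """
    # Local region: indices of the k training points nearest to the test point.
    region = sorted(range(len(train)),
                    key=lambda i: math.dist(train[i], test_point))[:k]
    n_det = len(train_scores)
    # Assumed pseudo ground truth: mean score across all detectors.
    target = [sum(train_scores[d][i] for d in range(n_det)) / n_det
              for i in region]
    # Competence: correlation with the pseudo target inside the region only.
    competence = [pearson([train_scores[d][i] for i in region], target)
                  for d in range(n_det)]
    return max(range(n_det), key=lambda d: competence[d])

# Toy 1-D training set with two well-separated neighborhoods (hypothetical).
train = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
train_scores = [
    [1, 2, 3, 1, 2, 3],  # detector 0
    [1, 2, 4, 3, 2, 1],  # detector 1
    [3, 2, 1, 1, 2, 4],  # detector 2
]

choice_left = select_local_detector(train, train_scores, (1.0,), k=3)
choice_right = select_local_detector(train, train_scores, (11.0,), k=3)
```

With these toy scores a different detector is selected in each neighborhood, which illustrates why a dynamic, locality-aware choice can differ from a single static combination.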


LSCP: Locally Selective Combination in Parallel Outlier Ensembles

Dec 04, 2018

Yue Zhao, Maciej K. Hryniewicki, Zain Nasrullah, Zheng Li


Abstract: In unsupervised outlier ensembles, the absence of ground truth makes the combination of base detectors a challenging task. Specifically, existing parallel outlier ensembles lack a reliable way of selecting competent base detectors during model combination, which affects accuracy and stability. In this paper, we propose a framework, called Locally Selective Combination in Parallel Outlier Ensembles (LSCP), which addresses this issue by defining a local region around a test instance using the consensus of its nearest neighbors in randomly generated feature spaces. The top-performing base detectors in this local region are selected and combined as the model's final output. Four variants of the LSCP framework are compared with six widely used combination algorithms for parallel ensembles. Experimental results demonstrate that one of these LSCP variants consistently outperforms baseline algorithms on the majority of eighteen real-world datasets.

* Under submission to the 2019 SIAM Data Mining conference (SDM'19)
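
The local-region definition mentioned in the LSCP abstract (consensus of nearest neighbors across randomly generated feature spaces) might look roughly like the sketch below. Several details are assumptions made for illustration and are not stated in the abstract: each subspace keeps between half and all of the features, Euclidean distance is used, and a training point joins the region when it is among the k nearest neighbors in more than half of the subspaces. The name `consensus_local_region` and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def consensus_local_region(train, test_point, k, n_subspaces, rng):
    """Indices of training points that are k-nearest neighbors of test_point
    in more than half of the randomly drawn feature subspaces."""
    n_features = len(test_point)
    votes = Counter()
    for _ in range(n_subspaces):
        # Assumed subspace size: between half of the features and all of them.
        size = rng.randint(max(1, n_features // 2), n_features)
        dims = rng.sample(range(n_features), size)

        def sub_dist(i):
            # Euclidean distance restricted to the sampled dimensions.
            return math.sqrt(sum((train[i][d] - test_point[d]) ** 2
                                 for d in dims))

        nearest = sorted(range(len(train)), key=sub_dist)[:k]
        votes.update(nearest)
    # Keep only points that a majority of subspaces agree on.
    return sorted(i for i, count in votes.items() if count > n_subspaces / 2)

# Toy data (hypothetical): three points near the origin, three far away.
train = [
    (0.0, 0.0, 0.0, 0.0),
    (0.1, 0.1, 0.1, 0.1),
    (0.2, 0.0, 0.1, 0.2),
    (5.0, 5.0, 5.0, 5.0),
    (6.0, 5.0, 6.0, 5.0),
    (5.0, 6.0, 5.0, 6.0),
]
test_point = (0.05, 0.05, 0.05, 0.05)

region = consensus_local_region(train, test_point, k=3, n_subspaces=10,
                                rng=random.Random(0))
```

Base detectors would then be compared only on the points in `region`, for example with a correlation-based competence measure, before the selected detectors produce the final score for the test instance.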
