Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Detecting Clusters of Anomalies on Low-Dimensional Feature Subsets with Application to Network Traffic Flow Data

Jun 10, 2015

Zhicong Qiu, David J. Miller, George Kesidis

Figure 1 for Detecting Clusters of Anomalies on Low-Dimensional Feature Subsets with Application to Network Traffic Flow Data

Figure 2 for Detecting Clusters of Anomalies on Low-Dimensional Feature Subsets with Application to Network Traffic Flow Data

Share this with someone who'll enjoy it:

Abstract:In a variety of applications, one desires to detect groups of anomalous data samples, with a group potentially manifesting its atypicality (relative to a reference model) on a low-dimensional subset of the full measured set of features. Samples may only be weakly atypical individually, whereas they may be strongly atypical when considered jointly. What makes this group anomaly detection problem quite challenging is that it is a priori unknown which subset of features jointly manifests a particular group of anomalies. Moreover, it is unknown how many anomalous groups are present in a given data batch. In this work, we develop a group anomaly detection (GAD) scheme to identify the subset of samples and subset of features that jointly specify an anomalous cluster. We apply our approach to network intrusion detection to detect BotNet and peer-to-peer flow clusters. Unlike previous studies, our approach captures and exploits statistical dependencies that may exist between the measured features. Experiments on real world network traffic data demonstrate the advantage of our proposed system, and highlight the importance of exploiting feature dependency structure, compared to the feature (or test) independence assumption made in previous studies.

View paper on

Share this with someone who'll enjoy it:

Title:Detecting Clusters of Anomalies on Low-Dimensional Feature Subsets with Application to Network Traffic Flow Data

Paper and Code