Justus von Brandt

Detecting Unknown DGAs without Context Information

May 30, 2022

Arthur Drichel, Justus von Brandt, Ulrike Meyer


Abstract: New malware emerges at a rapid pace and often incorporates Domain Generation Algorithms (DGAs) to avoid blocking the malware's connection to the command and control (C2) server. Current state-of-the-art classifiers are able to separate benign from malicious domains (binary classification) and attribute them with high probability to the DGAs that generated them (multiclass classification). While binary classifiers can label domains of yet unknown DGAs as malicious, multiclass classifiers can only assign domains to DGAs that are known at the time of training, limiting the ability to uncover new malware families. In this work, we perform a comprehensive study on the detection of new DGAs, which includes an evaluation of 59,690 classifiers. We examine four different approaches in 15 different configurations and propose a simple yet effective approach based on the combination of a softmax classifier and regular expressions (regexes) to detect multiple unknown DGAs with high probability. At the same time, our approach retains state-of-the-art classification performance for known DGAs. Our evaluation is based on a leave-one-group-out cross-validation with a total of 94 DGA families. By using the maximum number of known DGAs, our evaluation scenario is particularly difficult and close to the real world. All of the approaches examined are privacy-preserving, since they operate without context and exclusively on a single domain to be classified. We round off our study with a thorough discussion of class-incremental learning strategies that can adapt an existing classifier to newly discovered classes.

* Accepted at The 17th International Conference on Availability, Reliability and Security (ARES 2022)
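
The abstract above mentions combining a softmax classifier with regular expressions to flag domains from unknown DGAs. The following is only a minimal sketch of that general idea, not the authors' implementation: the family names, the regex patterns, the `threshold` value, and the rule "reject if the confidence is low or the predicted family's regex does not match" are all assumptions made for illustration.

```python
import math
import re

# Hypothetical per-family patterns (assumption): in practice these would be
# derived from known samples of each DGA family.
FAMILY_REGEXES = {
    "family_a": re.compile(r"[a-z]{12}\.com"),
    "family_b": re.compile(r"[0-9a-f]{16}\.(net|org)"),
}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(domain, logits, families, threshold=0.9):
    """Return a known family name, or "unknown" if the prediction is rejected.

    A prediction is rejected when the top softmax probability is below
    `threshold` or when the domain does not fully match the regex of the
    predicted family. Both criteria are illustrative assumptions.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    family = families[best]
    if probs[best] < threshold:
        return "unknown"
    if not FAMILY_REGEXES[family].fullmatch(domain):
        return "unknown"
    return family


families = ["family_a", "family_b"]
print(classify("abcdefghijkl.com", [5.0, 0.0], families))
print(classify("abc.com", [5.0, 0.0], families))
```

In this sketch the classifier operates on a single domain string and its logits only, which mirrors the context-free setting described in the abstract; the logits themselves would come from a trained model that is not shown here.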


The More, the Better? A Study on Collaborative Machine Learning for DGA Detection

Sep 24, 2021

Arthur Drichel, Benedikt Holmes, Justus von Brandt, Ulrike Meyer


Abstract: Domain generation algorithms (DGAs) prevent the connection between a botnet and its master from being blocked by generating a large number of domain names. Promising single-data-source approaches have been proposed for separating benign from DGA-generated domains. Collaborative machine learning (ML) can be used to enhance a classifier's detection rate, reduce its false positive rate (FPR), and improve its generalization capability to different networks. In this paper, we complement the research area of DGA detection by conducting a comprehensive collaborative learning study, including a total of 13,440 evaluation runs. In two real-world scenarios we evaluate a total of eleven different variations of collaborative learning using three different state-of-the-art classifiers. We show that collaborative ML can lead to a reduction in FPR by up to 51.7%. However, while collaborative ML is beneficial for DGA detection, not all approaches and classifier types profit equally. We round off our comprehensive study with a thorough discussion of the privacy threats posed by the different collaborative ML approaches.

* Accepted at The 3rd Workshop on Cyber-Security Arms Race (CYSARM '21)
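
To make the FPR figure in the abstract above concrete, the sketch below shows how a false positive rate and a relative reduction between two classifiers can be computed. It assumes the reported percentage is a relative reduction; the confusion-matrix counts are invented purely for illustration and are not taken from the paper.

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN): the share of benign domains flagged as malicious."""
    return fp / (fp + tn)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical counts on 10,000 benign domains (assumption, not paper data).
baseline_fpr = false_positive_rate(fp=60, tn=9940)  # locally trained classifier
collab_fpr = false_positive_rate(fp=29, tn=9971)    # collaboratively trained classifier

print(f"baseline FPR:       {baseline_fpr:.4f}")
print(f"collaborative FPR:  {collab_fpr:.4f}")
print(f"relative reduction: {relative_reduction(baseline_fpr, collab_fpr):.1%}")
```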


Finding Phish in a Haystack: A Pipeline for Phishing Classification on Certificate Transparency Logs

Jun 23, 2021

Arthur Drichel, Vincent Drury, Justus von Brandt, Ulrike Meyer


Abstract: Current popular phishing prevention techniques mainly utilize reactive blocklists, which leave a "window of opportunity" for attackers during which victims are unprotected. One possible approach to shorten this window aims to detect phishing attacks earlier, during website preparation, by monitoring Certificate Transparency (CT) logs. Previous attempts to work with CT log data for phishing classification exist; however, they lack evaluations on actual CT log data. In this paper, we present a pipeline that facilitates such evaluations by addressing a number of problems when working with CT log data. The pipeline includes dataset creation, training, and past or live classification of CT logs. Its modular structure makes it possible to easily exchange classifiers or verification sources to support ground truth labeling efforts and classifier comparisons. We test the pipeline on a number of new and existing classifiers, and find a general potential to improve classifiers for this scenario in the future. We publish the source code of the pipeline and the datasets used along with this paper (https://gitlab.com/rwth-itsec/ctl-pipeline), thus making future research in this direction more accessible.

* Accepted at The 16th International Conference on Availability, Reliability and Security (ARES 2021)
