Markus Wagner

CryptOpt: Automatic Optimization of Straightline Code

May 31, 2023
Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Manual engineering of high-performance implementations typically consumes many resources and requires in-depth knowledge of the hardware. Compilers try to address these problems; however, they are limited by design in what they can do. To address this, we present CryptOpt, an automatic optimizer for long stretches of straightline code. Experimental results across eight hardware platforms show that CryptOpt achieves a speed-up factor of up to 2.56 over current off-the-shelf compilers.

Assessing Domain Gap for Continual Domain Adaptation in Object Detection

Feb 21, 2023
Anh-Dzung Doan, Bach Long Nguyen, Surabhi Gupta, Ian Reid, Markus Wagner, Tat-Jun Chin

To ensure reliable object detection in autonomous systems, the detector must be able to adapt to changes in appearance caused by environmental factors such as time of day, weather, and seasons. Continually adapting the detector to incorporate these changes is a promising solution, but it can be computationally costly. Our proposed approach is to selectively adapt the detector only when necessary, using new data that does not have the same distribution as the current training data. To this end, we investigate three popular metrics for domain gap evaluation and find that there is a correlation between the domain gap and detection accuracy. Therefore, we apply the domain gap as a criterion to decide when to adapt the detector. Our experiments show that our approach has the potential to improve the efficiency of the detector's operation in real-world scenarios, where environmental conditions change in a cyclical manner, without sacrificing the overall performance of the detector. Our code is publicly available at https://github.com/dadung/DGE-CDA.
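
The adapt-only-when-necessary idea can be sketched as a gap measurement followed by a threshold test. The abstract does not name its three metrics here, so the distance between feature means below is only a stand-in, and `domain_gap`, `should_adapt`, and the `threshold` value are hypothetical names chosen for illustration.

```python
import math


def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def domain_gap(source_feats, target_feats):
    """Stand-in gap metric: Euclidean distance between the two feature means."""
    return math.dist(mean_vector(source_feats), mean_vector(target_feats))


def should_adapt(source_feats, target_feats, threshold):
    """Trigger adaptation only when the new data is far from the training data."""
    return domain_gap(source_feats, target_feats) > threshold


# Toy features (assumed values): "day" is the training domain, "night" is new data.
day = [[0.0, 1.0], [0.2, 0.8]]
night = [[3.0, 4.0], [3.2, 4.2]]
```

With these toy values, `should_adapt(day, night, 0.5)` triggers adaptation, while `should_adapt(day, day, 0.5)` does not.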

* Submitted to CVIU 

Socialz: Multi-Feature Social Fuzz Testing

Feb 17, 2023
Francisco Zanartu, Christoph Treude, Markus Wagner

Online social networks have become an integral aspect of our daily lives and play a crucial role in shaping our relationships with others. However, bugs and glitches, even minor ones, can cause anything from frustrating problems to serious data leaks that can have far-reaching impacts on millions of users. To mitigate these risks, fuzz testing, a method of testing with randomised inputs, can provide increased confidence in the correct functioning of a social network. However, implementing traditional fuzz testing methods can be prohibitively difficult or impractical for programmers outside of the network's development team. To tackle this challenge, we present Socialz, a novel approach to social fuzz testing that (1) characterises real users of a social network, (2) diversifies their interaction using evolutionary computation across multiple, non-trivial features, and (3) collects performance data as these interactions are executed. With Socialz, we aim to provide anyone with the capability to perform comprehensive social testing, thereby improving the reliability and security of online social networks used around the world.

ELEA -- Build your own Evolutionary Algorithm in your Browser

Feb 13, 2023
Markus Wagner, Erik Kohlros, Gerome Quantmeyer, Timo Kötzing

We provide an open-source framework for experimenting with evolutionary algorithms, which we call "Experimenting and Learning toolkit for Evolutionary Algorithms (ELEA)". ELEA is browser-based and allows users to assemble evolutionary algorithms using drag-and-drop, starting from a number of simple pre-designed examples, making the startup costs for employing the toolkit minimal. The designed examples can be executed, and the collected data can be displayed graphically. Further features include export of algorithm designs and experimental results as well as multi-threading. With its very intuitive user interface and the short time needed to get initial experiments going, this tool is especially suitable for explorative analyses of algorithms as well as for use in classrooms.

* You can find a running instance of ELEA as well as its source code at the following URLs: https://elea-toolkit.netlify.app/ https://github.com/HPI-ELEA/elea 

CryptOpt: Verified Compilation with Random Program Search for Cryptographic Primitives

Nov 19, 2022
Joel Kuepper, Andres Erbsen, Jason Gross, Owen Conoly, Chuyue Sun, Samuel Tian, David Wu, Adam Chlipala, Chitchanok Chuengsatiansup, Daniel Genkin, Markus Wagner, Yuval Yarom

Most software domains rely on compilers to translate high-level code to multiple different machine languages, with performance not too much worse than what developers would have the patience to write directly in assembly language. However, cryptography has been an exception, where many performance-critical routines have been written directly in assembly (sometimes through metaprogramming layers). Some past work has shown how to do formal verification of that assembly, and other work has shown how to generate C code automatically along with formal proof, but with consequent performance penalties vs. the best-known assembly. We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce, with mechanized proof (in Coq) whose final theorem statement mentions little beyond the input functional program and the operational semantics of x86-64 assembly. On the optimization side, we apply randomized search through the space of assembly programs, with repeated automatic benchmarking on target CPUs. On the formal-verification side, we connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker, incorporating a modest subset of known features of SMT solvers and symbolic-execution engines. The overall prototype is quite practical, e.g. producing new fastest-known implementations for the relatively new Intel i9 12G, of finite-field arithmetic for both Curve25519 (part of the TLS standard) and the Bitcoin elliptic curve secp256k1.
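
The optimization side described above, randomized search guided by repeated measurement, can be illustrated with a toy local search. This is a minimal sketch under loud assumptions: the only mutation is swapping neighbouring instructions, and `stall_cost` is a made-up proxy standing in for benchmarking on a real CPU. `DEPS`, `START`, and all function names are hypothetical, not CryptOpt's interface.

```python
import random

# Hypothetical straight-line program: "c" reads the result of "a", "d" reads "b".
DEPS = {"c": ["a"], "d": ["b"]}
START = ["a", "c", "b", "d"]


def is_valid(order, deps):
    """An order is valid if every instruction comes after all it depends on."""
    pos = {ins: i for i, ins in enumerate(order)}
    return all(pos[d] < pos[ins] for ins, ds in deps.items() for d in ds)


def stall_cost(order, deps=DEPS):
    """Stand-in for a benchmark: count instructions that directly follow a dependency."""
    return sum(1 for prev, cur in zip(order, order[1:]) if prev in deps.get(cur, ()))


def random_search(order, deps, cost, steps=1000, seed=0):
    """Keep a random neighbour swap whenever it is valid and not slower."""
    rng = random.Random(seed)
    best, best_cost = list(order), cost(order)
    for _ in range(steps):
        cand = list(best)
        i = rng.randrange(len(cand) - 1)
        cand[i], cand[i + 1] = cand[i + 1], cand[i]
        if not is_valid(cand, deps):
            continue  # the swap broke a data dependency, so discard it
        cand_cost = cost(cand)
        if cand_cost <= best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost


best_order, best_cost = random_search(START, DEPS, stall_cost)
```

Accepting equal-cost candidates lets the search drift across plateaus. A real pipeline would replace `stall_cost` with timing measurements, which are noisy and typically need repetition.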

Fairness in generative modeling

Oct 06, 2022
Mariia Zameshina, Olivier Teytaud, Fabien Teytaud, Vlad Hosu, Nathanael Carraz, Laurent Najman, Markus Wagner

We design general-purpose algorithms for addressing fairness issues and mode collapse in generative modeling. More precisely, to design fair algorithms for as many sensitive variables as possible, including variables we might not be aware of, we assume no prior knowledge of sensitive variables: our algorithms use unsupervised fairness only, meaning no information related to the sensitive variables is used for our fairness-improving methods. All images of faces (even generated ones) have been removed to mitigate legal risks.

* GECCO '22: Genetic and Evolutionary Computation Conference, Jul 2022, Boston, Massachusetts, United States. pp.320-323

Automatically Categorising GitHub Repositories by Application Domain

Jul 30, 2022
Francisco Zanartu, Christoph Treude, Bruno Cartaxo, Hudson Silva Borges, Pedro Moura, Markus Wagner, Gustavo Pinto

GitHub is the largest host of open source software on the Internet. This large, freely accessible database has attracted the attention of practitioners and researchers alike. But as GitHub's growth continues, it is becoming increasingly hard to navigate the plethora of repositories, which span a wide range of domains. Past work has shown that taking the application domain into account is crucial for tasks such as predicting the popularity of a repository and reasoning about project quality. In this work, we build on a previously annotated dataset of 5,000 GitHub repositories to design an automated classifier for categorising repositories by their application domain. The classifier uses state-of-the-art natural language processing techniques and machine learning to learn from multiple data sources and catalogue repositories according to five application domains. We contribute (1) an automated classifier that can assign popular repositories to each application domain with at least 70% precision, (2) an investigation of the approach's performance on less popular repositories, and (3) a practical application of this approach to answer how the adoption of software engineering practices differs across application domains. Our work aims to help the GitHub community identify repositories of interest and opens promising avenues for future work investigating differences between repositories from different application domains.

Is Surprisal in Issue Trackers Actionable?

Apr 15, 2022
James Caddy, Markus Wagner, Christoph Treude, Earl T. Barr, Miltiadis Allamanis

Background. From information theory, surprisal is a measurement of how unexpected an event is. Statistical language models provide a probabilistic approximation of natural languages, and because surprisal is constructed with the probability of an event occurring, it is therefore possible to determine the surprisal associated with English sentences. The issues and pull requests of software repository issue trackers give insight into the development process and likely contain the surprising events of this process. Objective. Prior works have identified that unusual events in software repositories are of interest to developers, and use simple code metrics-based methods for detecting them. In this study we will propose a new method for unusual event detection in software repositories using surprisal. With the ability to find surprising issues and pull requests, we intend to further analyse them to determine if they actually hold importance in a repository, or if they pose a significant challenge to address. If it is possible to find bad surprises early, or before they cause additional troubles, it is plausible that effort, cost and time will be saved as a result. Method. After extracting the issues and pull requests from 5000 of the most popular software repositories on GitHub, we will train a language model to represent these issues. We will measure their perceived importance in the repository, measure their resolution difficulty using several analogues, measure the surprisal of each, and finally generate inferential statistics to describe any correlations.
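
To make the notion concrete: the surprisal of an event with probability p is -log2(p) bits, and the surprisal of a text under a language model is the sum over its tokens. The sketch below uses an add-alpha smoothed unigram model purely as a stand-in for the language model the study plans to train. The function names, the smoothing constant, and the toy corpus are assumptions for illustration.

```python
import math
from collections import Counter


def surprisal(p):
    """Surprisal in bits of an event that occurs with probability p."""
    return -math.log2(p)


def train_unigram(corpus):
    """Count whitespace tokens; one extra vocabulary slot is reserved for unseen tokens."""
    counts = Counter(tok for text in corpus for tok in text.lower().split())
    return counts, sum(counts.values()), len(counts) + 1


def text_surprisal(text, model, alpha=1.0):
    """Sum of token surprisals under the add-alpha smoothed unigram model."""
    counts, total, vocab = model
    return sum(
        surprisal((counts[tok] + alpha) / (total + alpha * vocab))
        for tok in text.lower().split()
    )


# Toy issue titles (made up) standing in for an issue-tracker corpus.
model = train_unigram(["fix typo in readme", "fix typo in docs", "fix build"])
routine = text_surprisal("fix typo", model)
unusual = text_surprisal("kernel panic", model)
```

Here `unusual` exceeds `routine` because neither of its tokens was seen during training, which is the kind of signal the proposed method would rank issues by.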

* 8 pages, 1 figure. Submitted to 2022 International Conference on Mining Software Repositories Registered Reports track 

On the Fitness Landscapes of Interdependency Models in the Travelling Thief Problem

Feb 28, 2022
Mohamed El Yafrani, Marcella Scoczynski, Myriam Delgado, Ricardo Lüders, Peter Nielsen, Markus Wagner

Since its inception in 2013, the Travelling Thief Problem (TTP) has been widely studied as an example of problems with multiple interconnected sub-problems. The dependency in this model arises when tying the travelling time of the "thief" to the weight of the knapsack. However, other forms of dependency as well as combinations of dependencies should be considered for investigation, as they are often found in complex real-world problems. Our goal is to study the impact of different forms of dependency in the TTP using a simple local search algorithm. To achieve this, we use Local Optima Networks, a technique for analysing the fitness landscape.
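
The dependency between travelling time and knapsack weight can be sketched as follows, assuming the linear speed model commonly used in the TTP literature, where speed falls from `v_max` to `v_min` as the load approaches capacity. The function name, the data layout, and the default speeds are illustrative choices, not taken from this abstract.

```python
def travel_time(tour, dist, picked_weight, capacity, v_min=0.1, v_max=1.0):
    """Total time of a closed tour when speed drops linearly with carried weight.

    tour: list of city indices, visited in order and then back to the start.
    dist: matrix with dist[a][b] the distance from city a to city b.
    picked_weight: dict mapping a city to the weight picked up there.
    """
    time, load = 0.0, 0.0
    for i, city in enumerate(tour):
        load += picked_weight.get(city, 0.0)  # items are picked up on arrival
        nxt = tour[(i + 1) % len(tour)]
        speed = v_max - load * (v_max - v_min) / capacity
        time += dist[city][nxt] / speed
    return time


# Three cities, all one unit apart (toy instance).
DIST = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
empty_time = travel_time([0, 1, 2], DIST, {}, capacity=10.0)
loaded_time = travel_time([0, 1, 2], DIST, {1: 10.0}, capacity=10.0)
```

Picking up a full load at the second city slows the two remaining legs to `v_min`, so the same tour takes far longer. This coupling is what ties the routing and packing sub-problems together.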

Run-of-Mine Stockyard Recovery Scheduling and Optimisation for Multiple Reclaimers

Dec 23, 2021
Hirad Assimi, Ben Koch, Chris Garcia, Markus Wagner, Frank Neumann

Stockpiles are essential in the mining value chain, assisting in maximising value and production. Quality control of minerals taken from the stockpiles is a major concern for stockpile managers, as failure to meet some requirements can lead to financial losses. This problem was recently investigated using a single reclaimer and basic assumptions. This study extends the approach to consider multiple reclaimers in preparing for short- and long-term deliveries. The engagement of multiple reclaimers complicates the problem in terms of their interaction in preparing a delivery simultaneously and the safety distancing of reclaimers. We also consider more realistic settings, such as handling different minerals with different types of reclaimers. We propose methods that construct a solution step by step to meet precedence constraints for all reclaimers in the stockyard. We study various instances of the problem using greedy algorithms and Ant Colony Optimisation (ACO), and propose an integrated local search method for determining an efficient schedule. We fine-tune and compare the algorithms and show that the ACO combined with local search can yield efficient solutions.
