Abstract:In High Energy Physics, as in many other fields of science, the application of machine learning techniques has been crucial in advancing our understanding of fundamental phenomena. Increasingly, deep learning models are applied to analyze both simulated and experimental data. In most experiments, a rigorous regime of testing for physically motivated systematic uncertainties is in place. The numerical evaluation of these tests for differences between the data on the one side and simulations on the other side quantifies the effect of potential sources of mismodelling on the machine learning output. In addition, thorough comparisons of marginal distributions and (linear) feature correlations between data and simulation in "control regions" are applied. However, the guidance by physical motivation, and the need to constrain comparisons to specific regions, does not guarantee that all possible sources of deviations have been accounted for. We therefore propose a new adversarial attack - the CONSERVAttack - designed to exploit the remaining space of hypothetical deviations between simulation and data after the above mentioned tests. The resulting adversarial perturbations are consistent within the uncertainty bounds - evading standard validation checks - while successfully fooling the underlying model. We further propose strategies to mitigate such vulnerabilities and argue that robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics.
Abstract:Correlations between input parameters play a crucial role in many scientific classification tasks, since these are often related to fundamental laws of nature. For example, in high energy physics, one of the common deep learning use-cases is the classification of signal and background processes in particle collisions. In many such cases, the fundamental principles of the correlations between observables are often better understood than the actual distributions of the observables themselves. In this work, we present a new adversarial attack algorithm called Random Distribution Shuffle Attack (RDSA), emphasizing the correlations between observables in the network rather than individual feature characteristics. Correct application of the proposed novel attack can result in a significant improvement in classification performance - particularly in the context of data augmentation - when using the generated adversaries within adversarial training. Given that correlations between input features are also crucial in many other disciplines. We demonstrate the RDSA effectiveness on six classification tasks, including two particle collision challenges (using CERN Open Data), hand-written digit recognition (MNIST784), human activity recognition (HAR), weather forecasting (Rain in Australia), and ICU patient mortality (MIMIC-IV), demonstrating a general use case beyond fundamental physics for this new type of adversarial attack algorithms.