Alert button
Picture for Cristina Improta

Cristina Improta

Alert button

Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks

Aug 04, 2023
Domenico Cotroneo, Cristina Improta, Pietro Liguori, Roberto Natella

Figure 1 for Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks
Figure 2 for Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks
Figure 3 for Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks
Figure 4 for Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks

In this work, we assess the security of AI code generators via data poisoning, i.e., an attack that injects malicious samples into the training data to generate vulnerable code. We poison the training data by injecting increasing amounts of code containing security vulnerabilities and assess the attack's success on different state-of-the-art models for code generation. Our analysis shows that AI code generators are vulnerable to even a small amount of data poisoning. Moreover, the attack does not impact the correctness of code generated by pre-trained models, making it hard to detect.

Viaarxiv icon

Enhancing Robustness of AI Offensive Code Generators via Data Augmentation

Jun 08, 2023
Cristina Improta, Pietro Liguori, Roberto Natella, Bojan Cukic, Domenico Cotroneo

Figure 1 for Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
Figure 2 for Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
Figure 3 for Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
Figure 4 for Enhancing Robustness of AI Offensive Code Generators via Data Augmentation

In this work, we present a method to add perturbations to the code descriptions, i.e., new inputs in natural language (NL) from well-intentioned developers, in the context of security-oriented code, and analyze how and to what extent perturbations affect the performance of AI offensive code generators. Our experiments show that the performance of the code generators is highly affected by perturbations in the NL descriptions. To enhance the robustness of the code generators, we use the method to perform data augmentation, i.e., to increase the variability and diversity of the training data, proving its effectiveness against both perturbed and non-perturbed code descriptions.

Viaarxiv icon

Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators

Dec 12, 2022
Cristina Improta, Pietro Liguori, Roberto Natella, Bojan Cukic, Domenico Cotroneo

Figure 1 for Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators
Figure 2 for Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators
Figure 3 for Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators
Figure 4 for Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators

AI-based code generators are an emerging solution for automatically writing programs starting from descriptions in natural language, by using deep neural networks (Neural Machine Translation, NMT). In particular, code generators have been used for ethical hacking and offensive security testing by generating proof-of-concept attacks. Unfortunately, the evaluation of code generators still faces several issues. The current practice uses automatic metrics, which compute the textual similarity of generated code with ground-truth references. However, it is not clear what metric to use, and which metric is most suitable for specific contexts. This practical experience report analyzes a large set of output similarity metrics on offensive code generators. We apply the metrics on two state-of-the-art NMT models using two datasets containing offensive assembly and Python code with their descriptions in the English language. We compare the estimates from the automatic metrics with human evaluation and provide practical insights into their strengths and limitations.

Viaarxiv icon

Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation

Mar 30, 2022
Pietro Liguori, Cristina Improta, Simona De Vivo, Roberto Natella, Bojan Cukic, Domenico Cotroneo

Figure 1 for Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation
Figure 2 for Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation
Figure 3 for Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation

Neural Machine Translation (NMT) has reached a level of maturity to be recognized as the premier method for the translation between different languages and aroused interest in different research areas, including software engineering. A key step to validate the robustness of the NMT models consists in evaluating the performance of the models on adversarial inputs, i.e., inputs obtained from the original ones by adding small amounts of perturbation. However, when dealing with the specific task of the code generation (i.e., the generation of code starting from a description in natural language), it has not yet been defined an approach to validate the robustness of the NMT models. In this work, we address the problem by identifying a set of perturbations and metrics tailored for the robustness assessment of such models. We present a preliminary experimental evaluation, showing what type of perturbations affect the model the most and deriving useful insights for future directions.

* Paper accepted for publication in the proceedings of The 1st Intl. Workshop on Natural Language-based Software Engineering (NLBSE) to be held with ICSE 2022 
Viaarxiv icon