Michele Carminati

LeakSealer: A Semisupervised Defense for LLMs Against Prompt Injection and Leakage Attacks

Aug 01, 2025

Francesco Panebianco, Stefano Bonfanti, Francesco Trovò, Michele Carminati

Abstract:The generalization capabilities of Large Language Models (LLMs) have led to their widespread deployment across various applications. However, this increased adoption has introduced several security threats, notably in the forms of jailbreaking and data leakage attacks. Additionally, Retrieval Augmented Generation (RAG), while enhancing context-awareness in LLM responses, has inadvertently introduced vulnerabilities that can result in the leakage of sensitive information. Our contributions are twofold. First, we introduce a methodology to analyze historical interaction data from an LLM system, enabling the generation of usage maps categorized by topics (including adversarial interactions). This approach further provides forensic insights for tracking the evolution of jailbreaking attack patterns. Second, we propose LeakSealer, a model-agnostic framework that combines static analysis for forensic insights with dynamic defenses in a Human-In-The-Loop (HITL) pipeline. This technique identifies topic groups and detects anomalous patterns, allowing for proactive defense mechanisms. We empirically evaluate LeakSealer under two scenarios: (1) jailbreak attempts, employing a public benchmark dataset, and (2) PII leakage, supported by a curated dataset of labeled LLM interactions. In the static setting, LeakSealer achieves the highest precision and recall on the ToxicChat dataset when identifying prompt injection. In the dynamic setting, PII leakage detection achieves an AUPRC of $0.97$, significantly outperforming baselines such as Llama Guard.

* 22 pages, preprint
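
For readers unfamiliar with the AUPRC metric quoted in the abstract, the following is a minimal sketch of how the area under the precision-recall curve is commonly approximated as average precision. The function name, labels, and scores are hypothetical and are not taken from the paper. Tied scores are not treated specially.

```python
def average_precision(labels, scores):
    """Approximate AUPRC as average precision over a ranked list.

    labels: 1 for a positive (e.g. a leaking interaction), 0 otherwise.
    scores: detector scores, higher means more suspicious.
    """
    # Rank interactions from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each positive.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Made-up data for illustration only.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
ap = average_precision(labels, scores)
```

A value close to 1 means positives are ranked almost entirely ahead of negatives, which is why the metric suits imbalanced detection tasks.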

Assessing the Resilience of Automotive Intrusion Detection Systems to Adversarial Manipulation

Jun 12, 2025

Stefano Longari, Paolo Cerracchio, Michele Carminati, Stefano Zanero

Abstract:The security of modern vehicles has become increasingly important, with the controller area network (CAN) bus serving as a critical communication backbone for various Electronic Control Units (ECUs). The absence of robust security measures in CAN, coupled with the increasing connectivity of vehicles, makes them susceptible to cyberattacks. While intrusion detection systems (IDSs) have been developed to counter such threats, they are not foolproof. Adversarial attacks, particularly evasion attacks, can manipulate inputs to bypass detection by IDSs. This paper extends our previous work by investigating the feasibility and impact of gradient-based adversarial attacks performed with different degrees of knowledge against automotive IDSs. We consider three scenarios: white-box (attacker with full system knowledge), grey-box (partial system knowledge), and the more realistic black-box (no knowledge of the IDS's internal workings or data). We evaluate the effectiveness of the proposed attacks against state-of-the-art IDSs on two publicly available datasets. Additionally, we study the effect of the adversarial perturbation on the attack impact and evaluate real-time feasibility by precomputing evasive payloads for timed injection based on bus traffic. Our results demonstrate that the attacks are challenging because of automotive domain constraints, and that their effectiveness depends strongly on the dataset quality, the target IDS, and the attacker's degree of knowledge.
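
As a rough illustration of what a gradient-based evasion attack in the white-box setting looks like, here is a minimal sketch of a single fast-gradient-sign step against a toy logistic-regression detector. The detector, its weights, and the input are hypothetical. The sketch does not model CAN frame constraints or any IDS evaluated in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def detector_score(weights, bias, x):
    """Toy detector: probability that input x is an attack."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_evasion(weights, bias, x, epsilon):
    """One gradient-sign step that pushes an attack sample toward 'benign'.

    For true label 1 (attack), the cross-entropy gradient with respect
    to the input is (p - 1) * w. Stepping along its sign raises the
    loss, which lowers the detector's attack score.
    """
    p = detector_score(weights, bias, x)
    grad = [(p - 1.0) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical detector parameters and attack features.
weights, bias = [2.0, -1.0, 0.5], -0.5
x_attack = [1.0, 0.2, 0.8]

before = detector_score(weights, bias, x_attack)
x_adv = fgsm_evasion(weights, bias, x_attack, epsilon=0.1)
after = detector_score(weights, bias, x_adv)
```

The grey-box and black-box settings described in the abstract differ in that the attacker cannot read `weights` and `bias` directly and must work from a substitute model instead.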

TimberStrike: Dataset Reconstruction Attack Revealing Privacy Leakage in Federated Tree-Based Systems

Jun 09, 2025

Marco Di Gennaro, Giovanni De Lucia, Stefano Longari, Stefano Zanero, Michele Carminati

Abstract:Federated Learning has emerged as a privacy-oriented alternative to centralized Machine Learning, enabling collaborative model training without direct data sharing. While extensively studied for neural networks, the security and privacy implications of tree-based models remain underexplored. This work introduces TimberStrike, an optimization-based dataset reconstruction attack targeting horizontally federated tree-based models. Our attack, carried out by a single client, exploits the discrete nature of decision trees by using split values and decision paths to infer sensitive training data from other clients. We evaluate TimberStrike on State-of-the-Art federated gradient boosting implementations across multiple frameworks, including Flower, NVFlare, and FedTree, demonstrating their vulnerability to privacy breaches. On a publicly available stroke prediction dataset, TimberStrike consistently reconstructs between 73.05% and 95.63% of the target dataset across all implementations. We further analyze Differential Privacy, showing that while it partially mitigates the attack, it also significantly degrades model performance. Our findings highlight the need for privacy-preserving mechanisms specifically designed for tree-based Federated Learning systems, and we provide preliminary insights into their design.

* Proceedings on Privacy Enhancing Technologies (To appear) 2025(4)
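
To give an intuition for why split values and decision paths can leak training data, the sketch below intersects the conditions along one root-to-leaf path into per-feature intervals. It is not the TimberStrike optimization itself. The path encoding, the "left means value <= threshold" convention, and the example thresholds are all assumptions made for illustration.

```python
import math


def bounds_from_path(path, n_features):
    """Intersect split conditions along a decision path.

    path: list of (feature_index, threshold, went_left) tuples, where
    went_left=True is assumed to mean x[feature] <= threshold.
    Returns one (lower, upper) interval per feature.
    """
    lower = [-math.inf] * n_features
    upper = [math.inf] * n_features
    for feature, threshold, went_left in path:
        if went_left:
            # Sample satisfied x[feature] <= threshold.
            upper[feature] = min(upper[feature], threshold)
        else:
            # Sample satisfied x[feature] > threshold.
            lower[feature] = max(lower[feature], threshold)
    return list(zip(lower, upper))


# Hypothetical path over two features (e.g. age and BMI).
path = [(0, 60.0, False), (1, 30.5, True), (0, 75.0, True)]
bounds = bounds_from_path(path, n_features=2)
```

Every sample that reaches the leaf must lie inside these intervals, so each additional tree a client observes can only narrow the feasible region further.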

An Anomaly Detection System Based on Generative Classifiers for Controller Area Network

Dec 28, 2024

Chunheng Zhao, Stefano Longari, Michele Carminati, Pierluigi Pisu

Abstract:As electronic systems become increasingly complex and prevalent in modern vehicles, securing onboard networks is crucial, particularly as many of these systems are safety-critical. Researchers have demonstrated that modern vehicles are susceptible to various types of attacks, enabling attackers to gain control and compromise safety-critical electronic systems. Consequently, several Intrusion Detection Systems (IDSs) have been proposed in the literature to detect such cyber-attacks on vehicles. This paper introduces a novel generative classifier-based Intrusion Detection System (IDS) designed for anomaly detection in automotive networks, specifically focusing on the Controller Area Network (CAN). Leveraging variational Bayes, our proposed IDS utilizes a deep latent variable model to construct a causal graph for conditional probabilities. An auto-encoder architecture is utilized to build the classifier to estimate conditional probabilities, which contribute to the final prediction probabilities through Bayesian inference. Comparative evaluations against state-of-the-art IDSs on a public Car-hacking dataset highlight our proposed classifier's superior performance in improving detection accuracy and F1-score. The proposed IDS demonstrates its efficacy by outperforming existing models with limited training data, providing enhanced security assurance for automotive systems.

A Secure and Trustworthy Network Architecture for Federated Learning Healthcare Applications

Apr 17, 2024

Antonio Boiano, Marco Di Gennaro, Luca Barbieri, Michele Carminati, Monica Nicoli, Alessandro Redondi, Stefano Savazzi, Albert Sund Aillet, Diogo Reis Santos, Luigi Serio

Abstract:Federated Learning (FL) has emerged as a promising approach for privacy-preserving machine learning, particularly in sensitive domains such as healthcare. In this context, the TRUSTroke project aims to leverage FL to assist clinicians in ischemic stroke prediction. This paper provides an overview of the TRUSTroke FL network infrastructure. The proposed architecture adopts a client-server model with a central Parameter Server (PS). We introduce a Docker-based design for the client nodes, offering a flexible solution for implementing FL processes in clinical settings. The impact of different communication protocols (HTTP or MQTT) on FL network operation is analyzed, with MQTT selected for its suitability in FL scenarios. A control plane to support the main operations required by FL processes is also proposed. The paper concludes with an analysis of security aspects of the FL architecture, addressing potential threats and proposing mitigation strategies to increase the trustworthiness level.

Real-time Evasion Attacks with Physical Constraints on Deep Learning-based Anomaly Detectors in Industrial Control Systems

Jul 17, 2019

Alessandro Erba, Riccardo Taormina, Stefano Galelli, Marcello Pogliani, Michele Carminati, Stefano Zanero, Nils Ole Tippenhauer

Abstract:Recently, a number of deep learning-based anomaly detection algorithms were proposed to detect attacks in dynamic industrial control systems. The detectors operate on measured sensor data, leveraging physical process models learned a priori. Evading detection by such systems is challenging, as an attacker needs to manipulate a constrained number of sensor readings in real time with realistic perturbations according to the current state of the system. In this work, we propose a number of evasion attacks (with different assumptions on the attacker's knowledge), and compare the attacks' cost and efficiency against replay attacks. In particular, we show that a replay attack on a subset of sensor values can be detected easily as it violates physical constraints. In contrast, our proposed attacks leverage manipulated sensor readings that observe learned physical constraints of the system. Our proposed white-box attacker uses an optimization approach with a detection oracle, while our black-box attacker uses an autoencoder (or a convolutional neural network) to translate anomalous data into normal data. Our proposed approaches are implemented and evaluated on two different datasets pertaining to the domain of water distribution networks. We then demonstrate the efficacy of the real-time attack on a realistic testbed. Results show that the accuracy of the detection algorithms can be significantly reduced through real-time adversarial actions: for the BATADAL dataset, the attacker can reduce the detection accuracy from 0.6 to 0.14. In addition, we discuss and implement an availability attack, in which the attacker introduces detection events with minimal changes of the reported data, in order to reduce confidence in the detector.
