Matteo Boffa

Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning

Mar 20, 2026

Jianan Huang, Rodolfo V. Valentim, Luca Vassio, Matteo Boffa, Marco Mellia, Idilio Drago, Dario Rossi

Abstract: The use of ML in cybersecurity has long been impaired by generalization issues: Models that work well in controlled scenarios fail to maintain performance in production. The root cause often lies in ML algorithms learning superficial patterns (shortcuts) rather than underlying cybersecurity concepts. We investigate contrastive multi-modal learning as a first step towards improving ML performance in cybersecurity tasks. We aim to transfer knowledge from data-rich modalities, such as text, to data-scarce modalities, such as payloads. We set up a case study on threat classification and propose a two-stage multi-modal contrastive learning framework that uses textual vulnerability descriptions to guide payload classification. First, we construct a semantically meaningful embedding space using contrastive learning on descriptions. Then, we align payloads to this space, transferring knowledge from text to payloads. We evaluate the approach on a large-scale private dataset and a synthetic benchmark built from public CVE descriptions and LLM-generated payloads. The methodology appears to reduce shortcut learning over baselines on both benchmarks. We release our synthetic benchmark and source code as open source.

* Submitted to Euro S&P - 5th International Workshop on Designing and Measuring Security in Systems with AI
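
The second stage described in the abstract (aligning payloads to a text embedding space) can be illustrated with a contrastive objective. The sketch below is not the authors' implementation: it assumes an InfoNCE-style loss, cosine similarity, and a temperature of 0.1, none of which are stated in the abstract, and the embeddings and the names `info_nce`, `text_embs`, `aligned`, and `misaligned` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(payload_embs, text_embs, temperature=0.1):
    """Mean InfoNCE loss, assuming payload i is paired with description i.

    Every other description in the batch acts as a negative.
    """
    total = 0.0
    for i, payload in enumerate(payload_embs):
        logits = [cosine(payload, text) / temperature for text in text_embs]
        # Numerically stable log-sum-exp over all candidate descriptions.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / len(payload_embs)

# Hypothetical description embeddings, treated as fixed after stage one.
text_embs = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical payload embeddings: one set close to its paired text, one swapped.
aligned = [[0.9, 0.1], [0.1, 0.9]]
misaligned = [[0.1, 0.9], [0.9, 0.1]]

print(info_nce(aligned, text_embs), info_nce(misaligned, text_embs))
```

Under these assumptions the loss is lower when each payload embedding sits near its paired description, which is the signal a payload encoder would be trained on.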

Towards Agentic Honeynet Configuration

Mar 14, 2026

Federico Mirra, Matteo Boffa, Idilio Drago, Danilo Giordano, Marco Mellia

Abstract: Honeypots are deception systems that emulate vulnerable services to collect threat intelligence. While deploying many honeypots increases the opportunity to observe attacker behaviour, in practice network and computational resources limit the number of honeypots that can be exposed. Hence, practitioners must select the assets to deploy, a decision that is typically made statically despite attackers' tactics evolving over time. This work investigates an AI-driven agentic architecture that autonomously manages honeypot exposure in response to ongoing attacks. The proposed agent analyses Intrusion Detection System (IDS) alerts and network state to infer the progression of the attack, identify compromised assets, and predict likely attacker targets. Based on this assessment, the agent dynamically reconfigures the system to maintain attacker engagement while minimizing unnecessary exposure. The approach is evaluated in a simulated environment where attackers execute Proof-of-Concept exploits for known CVEs. Preliminary results indicate that the agent can effectively infer the intent of the attacker and improve the efficiency of exposure under resource constraints.

* Accepted at AgenNet 2026 - Colocated with NOMS 2026

The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification

Jul 22, 2025

Yuqi Zhao, Giovanni Dettori, Matteo Boffa, Luca Vassio, Marco Mellia

Abstract: Recently, we have witnessed an explosion of proposals that, inspired by Language Models like BERT, exploit Representation Learning models to create traffic representations. All of them promise astonishing performance in encrypted traffic classification (up to 98% accuracy). In this paper, with a networking expert mindset, we critically reassess their performance. Through extensive analysis, we demonstrate that the reported successes are heavily influenced by data preparation problems, which allow these models to find easy shortcuts - spurious correlations between features and labels - during fine-tuning that unrealistically boost their performance. When such shortcuts are not present - as in real scenarios - these models perform poorly. We also introduce Pcap-Encoder, an LM-based representation learning model that we specifically design to extract features from protocol headers. Pcap-Encoder appears to be the only model that provides an instrumental representation for traffic classification. Yet, its complexity questions its applicability in practical settings. Our findings reveal flaws in dataset preparation and model training, calling for a better and more conscious test design. We propose a correct evaluation methodology and stress the need for rigorous benchmarking.

* This paper has been accepted at ACM SIGCOMM 2025. It will appear in the proceedings with DOI 10.1145/3718958.3750498

LogPrécis: Unleashing Language Models for Automated Shell Log Analysis

Jul 17, 2023

Matteo Boffa, Rodolfo Vieira Valentim, Luca Vassio, Danilo Giordano, Idilio Drago, Marco Mellia, Zied Ben Houidi

Abstract: The collection of security-related logs holds the key to understanding attack behaviors and diagnosing vulnerabilities. Still, their analysis remains a daunting challenge. Recently, Language Models (LMs) have demonstrated unmatched potential in understanding natural and programming languages. The question arises whether and how LMs could also be useful for security experts since their logs contain intrinsically confused and obfuscated information. In this paper, we systematically study how to benefit from the state-of-the-art in LM to automatically analyze text-like Unix shell attack logs. We present a thorough design methodology that leads to LogPrécis. It receives as input raw shell sessions and automatically identifies and assigns the attacker tactic to each portion of the session, i.e., unveiling the sequence of the attacker's goals. We demonstrate LogPrécis's capability to support the analysis of two large datasets containing about 400,000 unique Unix shell attacks. LogPrécis reduces them into about 3,000 fingerprints, each grouping sessions with the same sequence of tactics. The abstraction it provides lets the analyst better understand attacks, identify fingerprints, detect novelty, link similar attacks, and track families and mutations. Overall, LogPrécis, released as open source, paves the way for better and more responsive defense against cyberattacks.
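
The fingerprinting idea in the abstract (grouping sessions that share the same sequence of tactics) can be sketched in a few lines. This is an illustration only, not the released tool: it assumes each session has already been labeled with an ordered list of tactics, and the session IDs, tactic names, and the functions `fingerprint` and `group_by_fingerprint` are hypothetical.

```python
from collections import defaultdict

def fingerprint(tactics):
    """Use the ordered tactic sequence itself as the session fingerprint."""
    return tuple(tactics)

def group_by_fingerprint(sessions):
    """Map each fingerprint to the IDs of the sessions that share it."""
    groups = defaultdict(list)
    for session_id, tactics in sessions.items():
        groups[fingerprint(tactics)].append(session_id)
    return dict(groups)

# Hypothetical sessions, already labeled with one tactic per portion.
sessions = {
    "s1": ["Discovery", "Execution"],
    "s2": ["Discovery", "Execution"],
    "s3": ["Persistence", "Execution"],
}

groups = group_by_fingerprint(sessions)
for tactics, session_ids in groups.items():
    print(" > ".join(tactics), session_ids)
```

In this toy input, three sessions collapse into two fingerprints, which mirrors on a small scale the reduction from many sessions to far fewer fingerprints reported in the abstract.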

Neural combinatorial optimization beyond the TSP: Existing architectures under-represent graph structure

Jan 03, 2022

Matteo Boffa, Zied Ben Houidi, Jonatan Krolikowski, Dario Rossi

Abstract: Recent years have witnessed the promise that reinforcement learning, coupled with Graph Neural Network (GNN) architectures, could learn to solve hard combinatorial optimization problems: given raw input data and an evaluator to guide the process, the idea is to automatically learn a policy able to return feasible and high-quality outputs. Recent work has shown promising results, but these were mainly evaluated on the travelling salesman problem (TSP) and similar abstract variants such as the Split Delivery Vehicle Routing Problem (SDVRP). In this paper, we analyze how and whether recent neural architectures can be applied to graph problems of practical importance. We thus set out to systematically "transfer" these architectures to the Power and Channel Allocation Problem (PCAP), which has practical relevance for, e.g., radio resource allocation in wireless networks. Our experimental results suggest that existing architectures (i) are still incapable of capturing graph structural features and (ii) are not suitable for problems where the actions on the graph change the graph attributes. On a positive note, we show that augmenting the structural representation of problems with Distance Encoding is a promising step towards the still-ambitious goal of learning multi-purpose autonomous solvers.

* 8 pages, 7 figures, accepted at AAAI'22 GCLR 2022 workshop on Graphs and more Complex structures for Learning and Reasoning
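
As a rough illustration of what augmenting a graph's structural representation with distance information can look like, the sketch below attaches to every node its hop distance to a set of anchor nodes. This is one simple reading of distance-based features and may differ from the Distance Encoding variant used in the paper; the graph, the anchor choice, and the names `bfs_distances` and `distance_features` are hypothetical.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Hop distance from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def distance_features(adjacency, anchors, unreachable=-1):
    """One feature per anchor: the node's hop distance to that anchor."""
    per_anchor = [bfs_distances(adjacency, anchor) for anchor in anchors]
    return {
        node: [dist.get(node, unreachable) for dist in per_anchor]
        for node in adjacency
    }

# Hypothetical undirected path graph 0 - 1 - 2 - 3, with anchors at both ends.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = distance_features(graph, anchors=[0, 3])
print(features)
```

Such vectors could be concatenated with existing node attributes before they are fed to a GNN, so that nodes with identical local neighbourhoods but different positions in the graph receive different inputs.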
