Abstract:Software-Defined Networking (SDN) improves network flexibility but also increases the need for reliable and interpretable intrusion detection. Large Language Models (LLMs) have recently been explored for cybersecurity tasks due to their strong representation learning capabilities; however, their lack of transparency limits their practical adoption in security-critical environments. Understanding how LLMs make decisions is therefore essential. This paper presents an attribution-driven analysis of encoder-based LLMs for network intrusion detection using flow-level traffic features. Attribution analysis demonstrates that model decisions are driven by meaningful traffic behavior patterns, improving transparency and trust in transformer-based SDN intrusion detection. These patterns align with established intrusion detection principles, indicating that LLMs learn attack behavior from traffic dynamics. This work demonstrates the value of attribution methods for validating and trusting LLM-based security analysis.
Abstract:This paper explores the relatively underexplored application of Positive Unlabeled (PU) Learning and Negative Unlabeled (NU) Learning in the cybersecurity domain. While these semi-supervised learning methods have been applied successfully in fields like medicine and marketing, their potential in cybersecurity remains largely untapped. The paper identifies key areas of cybersecurity--such as intrusion detection, vulnerability management, malware detection, and threat intelligence--where PU/NU learning can offer significant improvements, particularly in scenarios with imbalanced or limited labeled data. We provide a detailed problem formulation for each subfield, supported by mathematical reasoning, and highlight the specific challenges and research gaps in scaling these methods to real-time systems, addressing class imbalance, and adapting to evolving threats. Finally, we propose future directions to advance the integration of PU/NU learning in cybersecurity, offering solutions that can better detect, manage, and mitigate emerging cyber threats.




Abstract:This paper explores the application of Positive-Unlabeled (PU) learning for enhanced Distributed Denial-of-Service (DDoS) detection in cloud environments. Utilizing the $\texttt{BCCC-cPacket-Cloud-DDoS-2024}$ dataset, we implement PU learning with four machine learning algorithms: XGBoost, Random Forest, Support Vector Machine, and Na\"{i}ve Bayes. Our results demonstrate the superior performance of ensemble methods, with XGBoost and Random Forest achieving $F_{1}$ scores exceeding 98%. We quantify the efficacy of each approach using metrics including $F_{1}$ score, ROC AUC, Recall, and Precision. This study bridges the gap between PU learning and cloud-based anomaly detection, providing a foundation for addressing Context-Aware DDoS Detection in multi-cloud environments. Our findings highlight the potential of PU learning in scenarios with limited labeled data, offering valuable insights for developing more robust and adaptive cloud security mechanisms.