Abstract: Web security demands rapid responses to evolving cyber threats. Agentic Artificial Intelligence (AI) promises automation, but automated security responses must also be trustworthy. This work investigates the role of semantic relations in extracting information for sensitive operational tasks, such as configuring security controls to mitigate threats. To this end, it proposes using hypernym-hyponym textual relations to extract relevant information from Cyber Threat Intelligence (CTI) reports. Following a neuro-symbolic approach, a multi-agent system automatically generates CLIPS code for an expert system that creates firewall rules to block malicious network traffic. Experimental results show that the hypernym-hyponym retrieval strategy outperforms various baselines and that the agentic approach is more effective at mitigating threats.
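To make the idea of hypernym-hyponym retrieval concrete, the following is a minimal sketch, not the system described in the abstract. It assumes a tiny hand-written taxonomy (`TAXONOMY`), a made-up three-sentence report (`REPORT`), and hypothetical helper names (`hyponyms`, `retrieve`). A query for a general term (the hypernym) is expanded to all of its more specific terms (hyponyms), and sentences mentioning any of them are retrieved.

```python
import re

# Hypothetical toy taxonomy (an assumption for illustration only):
# each hypernym maps to its direct hyponyms.
TAXONOMY = {
    "malware": ["ransomware", "trojan", "worm"],
    "trojan": ["backdoor"],
}


def hyponyms(term, taxonomy):
    """Return the term itself plus all of its transitive hyponyms."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current in found:
            continue
        found.add(current)
        stack.extend(taxonomy.get(current, []))
    return found


def retrieve(sentences, hypernym, taxonomy):
    """Keep the sentences that mention the hypernym or any of its hyponyms."""
    terms = hyponyms(hypernym, taxonomy)
    hits = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & terms:
            hits.append(sentence)
    return hits


# Made-up report sentences; the IP address is from a documentation range.
REPORT = [
    "The backdoor contacts 203.0.113.7 over TCP port 443.",
    "The vendor released a patch last week.",
    "A ransomware sample was delivered by email.",
]

matches = retrieve(REPORT, "malware", TAXONOMY)
for sentence in matches:
    print(sentence)
```

In this sketch, a query for "malware" retrieves the first and third sentences even though neither contains the word "malware", which is the kind of match a plain keyword search would miss.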
Abstract: This work compares three pipelines for training transformer-based neural networks to produce machine translators for Bambara, a Mandè language spoken in Africa by an estimated 14,188,850 people. The first pipeline trains a simple transformer to translate sentences from French into Bambara. The second fine-tunes LLaMA3 (3B-8B) instructor models using decoder-only architectures for French-to-Bambara translation. Models from the first two pipelines were trained with different hyperparameter combinations to improve BLEU and chrF scores, and were evaluated on both test sentences and official Bambara benchmarks. The third pipeline uses language distillation with a student-teacher dual neural network to integrate Bambara into a pre-trained LaBSE model, which provides language-agnostic embeddings. A BERT extension is then applied to LaBSE to generate translations. All pipelines were tested on the Dokotoro (medical) and Bayelemagaba (mixed-domain) datasets. Results show that the first pipeline, although simpler, achieves the best translation accuracy (10% BLEU, 21% chrF on Bayelemagaba), consistent with low-resource translation results. On the Yiri dataset, created for this work, it achieves 33.81% BLEU and 41% chrF. Instructor-based models perform better on single datasets than on aggregated collections, suggesting they capture dataset-specific patterns more effectively.
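For readers unfamiliar with chrF, the following is a simplified sketch of a character n-gram F-score. It is not the evaluation code used in this work, and established toolkits differ in details such as whitespace handling and smoothing. The function names (`char_ngrams`, `simple_chrf`) and the example strings are assumptions for illustration. The defaults of n-gram orders up to 6 and beta = 2 follow common chrF settings.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: F-beta over averaged character n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders longer than either string.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Placeholder strings, not real translation output.
score = simple_chrf("the cat sat", "the cat sits")
print(f"{100 * score:.2f}%")
```

Because the score is built from character n-grams rather than whole words, partially correct word forms still earn credit, which is one reason chrF is often reported alongside BLEU for morphologically rich or low-resource languages.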