Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kamel Smaili

BOUTEF: A Multilingual Corpus for FakeNews in North Africa -- Language as a Weapon

May 29, 2026

Kamel Smaili, Yassine Toughrai, Amina Laggoun, David Langlois

Abstract:The rapid spread of fake news on social media has become a major challenge, particularly in multilingual and under-resourced contexts such as North Africa. In this paper, we introduce BOUTEF, a large-scale multilingual corpus designed to study the propagation, characteristics, and impact of fake news in Algeria and Tunisia. The corpus integrates three complementary components: fake narratives, genuine narratives, and associated user-generated comments, along with verified debunking information. It covers a wide range of languages and linguistic varieties, including MSA, Algerian and Tunisian dialects, Arabizi, French, English, and code-switched language. Building on this resource, we conduct a comprehensive empirical analysis combining quantitative and qualitative approaches. We examine thematic distributions, linguistic and rhetorical strategies, sentiment patterns, and social engagement dynamics. Statistical analyses reveal significant associations between thematic categories and message veracity, as well as strong correlations between user engagement and the visibility of fake content. Our findings show that fake news relies heavily on emotionally charged narratives, sensational framing, and hybrid linguistic practices that enhance virality and audience engagement. In contrast, debunking content adopts a more factual and verification-oriented style. Furthermore, a comparative analysis between Algeria and Tunisia highlights both shared dynamics and country-specific characteristics shaped by sociopolitical contexts. The results emphasize the role of informal language practices in the diffusion and reception of misinformation. By providing a rich, annotated, and publicly available dataset, this work contributes to advancing research on fake news detection, low-resource language processing, and the understanding of information disorders in complex linguistic environments.

Via

Access Paper or Ask Questions

Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Sep 28, 2022

Youness Moukafih, Mounir Ghogho, Kamel Smaili

Figure 1 for Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Figure 2 for Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Figure 3 for Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Figure 4 for Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Abstract:Recently, Supervised Contrastive Learning (SCL) has been shown to achieve excellent performance in most classification tasks. In SCL, a neural network is trained to optimize two objectives: pull an anchor and positive samples together in the embedding space, and push the anchor apart from the negatives. However, these two different objectives may conflict, requiring trade-offs between them during optimization. In this work, we formulate the SCL problem as a Multi-Objective Optimization problem for the fine-tuning phase of RoBERTa language model. Two methods are utilized to solve the optimization problem: (i) the linear scalarization (LS) method, which minimizes a weighted linear combination of pertask losses; and (ii) the Exact Pareto Optimal (EPO) method which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks, without using data augmentations, memory banks, or generating adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong competitive contrastive learning baseline

Via

Access Paper or Ask Questions