Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gerasimos Spanakis

Cross-Modal Robustness Transfer (CMRT): Training Robust Speech Translation Models Using Adversarial Text

Feb 12, 2026

Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Abstract:End-to-End Speech Translation (E2E-ST) has seen significant advancements, yet current models are primarily benchmarked on curated, "clean" datasets. This overlooks critical real-world challenges, such as morphological robustness to inflectional variations common in non-native or dialectal speech. In this work, we adapt a text-based adversarial attack targeting inflectional morphology to the speech domain and demonstrate that state-of-the-art E2E-ST models are highly vulnerable it. While adversarial training effectively mitigates such risks in text-based tasks, generating high-quality adversarial speech data remains computationally expensive and technically challenging. To address this, we propose Cross-Modal Robustness Transfer (CMRT), a framework that transfers adversarial robustness from the text modality to the speech modality. Our method eliminates the requirement for adversarial speech data during training. Extensive experiments across four language pairs demonstrate that CMRT improves adversarial robustness by an average of more than 3 BLEU points, establishing a new baseline for robust E2E-ST without the overhead of generating adversarial speech.

Via

Access Paper or Ask Questions

From FusHa to Folk: Exploring Cross-Lingual Transfer in Arabic Language Models

Feb 10, 2026

Abdulmuizz Khalak, Abderrahmane Issam, Gerasimos Spanakis

Abstract:Arabic Language Models (LMs) are pretrained predominately on Modern Standard Arabic (MSA) and are expected to transfer to its dialects. While MSA as the standard written variety is commonly used in formal settings, people speak and write online in various dialects that are spread across the Arab region. This poses limitations for Arabic LMs, since its dialects vary in their similarity to MSA. In this work we study cross-lingual transfer of Arabic models using probing on 3 Natural Language Processing (NLP) Tasks, and representational similarity. Our results indicate that transfer is possible but disproportionate across dialects, which we find to be partially explained by their geographic proximity. Furthermore, we find evidence for negative interference in models trained to support all Arabic dialects. This questions their degree of similarity, and raises concerns for cross-lingual transfer in Arabic models.

* Accepted to VarDial 2026

Via

Access Paper or Ask Questions

Language Models as Artificial Learners: Investigating Crosslinguistic Influence

Jan 29, 2026

Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Abstract:Despite the centrality of crosslinguistic influence (CLI) to bilingualism research, human studies often yield conflicting results due to inherent experimental variance. We address these inconsistencies by using language models (LMs) as controlled statistical learners to systematically simulate CLI and isolate its underlying drivers. Specifically, we study the effect of varying the L1 language dominance and the L2 language proficiency, which we manipulate by controlling the L2 age of exposure -- defined as the training step at which the L2 is introduced. Furthermore, we investigate the impact of pretraining on L1 languages with varying syntactic distance from the L2. Using cross-linguistic priming, we analyze how activating L1 structures impacts L2 processing. Our results align with evidence from psycholinguistic studies, confirming that language dominance and proficiency are strong predictors of CLI. We further find that while priming of grammatical structures is bidirectional, the priming of ungrammatical structures is sensitive to language dominance. Finally, we provide mechanistic evidence of CLI in LMs, demonstrating that the L1 is co-activated during L2 processing and directly influences the neural circuitry recruited for the L2. More broadly, our work demonstrates that LMs can serve as a computational framework to inform theories of human CLI.

Via

Access Paper or Ask Questions

You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models

Sep 17, 2025

Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Figure 1 for You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models

Figure 2 for You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models

Figure 3 for You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models

Figure 4 for You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models

Abstract:Achieving human-level translations requires leveraging context to ensure coherence and handle complex phenomena like pronoun disambiguation. Sparsity of contextually rich examples in the standard training data has been hypothesized as the reason for the difficulty of context utilization. In this work, we systematically validate this claim in both single- and multilingual settings by constructing training datasets with a controlled proportions of contextually relevant examples. We demonstrate a strong association between training data sparsity and model performance confirming sparsity as a key bottleneck. Importantly, we reveal that improvements in one contextual phenomenon do no generalize to others. While we observe some cross-lingual transfer, it is not significantly higher between languages within the same sub-family. Finally, we propose and empirically evaluate two training strategies designed to leverage the available data. These strategies improve context utilization, resulting in accuracy gains of up to 6 and 8 percentage points on the ctxPro evaluation in single- and multilingual settings respectively.

* EMNLP 2025 main conference

Via

Access Paper or Ask Questions

Are Companies Taking AI Risks Seriously? A Systematic Analysis of Companies' AI Risk Disclosures in SEC 10-K forms

Aug 26, 2025

Lucas G. Uberti-Bona Marin, Bram Rijsbosch, Gerasimos Spanakis, Konrad Kollnig

Abstract:As Artificial Intelligence becomes increasingly central to corporate strategies, concerns over its risks are growing too. In response, regulators are pushing for greater transparency in how companies identify, report and mitigate AI-related risks. In the US, the Securities and Exchange Commission (SEC) repeatedly warned companies to provide their investors with more accurate disclosures of AI-related risks; recent enforcement and litigation against companies' misleading AI claims reinforce these warnings. In the EU, new laws - like the AI Act and Digital Services Act - introduced additional rules on AI risk reporting and mitigation. Given these developments, it is essential to examine if and how companies report AI-related risks to the public. This study presents the first large-scale systematic analysis of AI risk disclosures in SEC 10-K filings, which require public companies to report material risks to their company. We analyse over 30,000 filings from more than 7,000 companies over the past five years, combining quantitative and qualitative analysis. Our findings reveal a sharp increase in the companies that mention AI risk, up from 4% in 2020 to over 43% in the most recent 2024 filings. While legal and competitive AI risks are the most frequently mentioned, we also find growing attention to societal AI risks, such as cyberattacks, fraud, and technical limitations of AI systems. However, many disclosures remain generic or lack details on mitigation strategies, echoing concerns raised recently by the SEC about the quality of AI-related risk reporting. To support future research, we publicly release a web-based tool for easily extracting and analysing keyword-based disclosures across SEC filings.

* To be published in the ECML PKDD SoGood (Data Science for Social Good) workshop proceedings

Via

Access Paper or Ask Questions

Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Jul 22, 2025

Elza Strazda, Gerasimos Spanakis

Figure 1 for Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Figure 2 for Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Figure 3 for Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Figure 4 for Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Abstract:Warning: This paper contains explicit statements of offensive stereotypes which might be upsetting. Language models are prone to exhibiting biases, further amplifying unfair and harmful stereotypes. Given the fast-growing popularity and wide application of these models, it is necessary to ensure safe and fair language models. As of recent considerable attention has been paid to measuring bias in language models, yet the majority of studies have focused only on English language. A Dutch version of the US-specific CrowS-Pairs dataset for measuring bias in Dutch language models is introduced. The resulting dataset consists of 1463 sentence pairs that cover bias in 9 categories, such as Sexual orientation, Gender and Disability. The sentence pairs are composed of contrasting sentences, where one of the sentences concerns disadvantaged groups and the other advantaged groups. Using the Dutch CrowS-Pairs dataset, we show that various language models, BERTje, RobBERT, multilingual BERT, GEITje and Mistral-7B exhibit substantial bias across the various bias categories. Using the English and French versions of the CrowS-Pairs dataset, bias was evaluated in English (BERT and RoBERTa) and French (FlauBERT and CamemBERT) language models, and it was shown that English models exhibit the most bias, whereas Dutch models the least amount of bias. Additionally, results also indicate that assigning a persona to a language model changes the level of bias it exhibits. These findings highlight the variability of bias across languages and contexts, suggesting that cultural and linguistic factors play a significant role in shaping model biases.

* 10 pages, accepted at RANLP 2025 data and code here: https://github.com/jerryspan/Dutch-CrowS-Pairs

Via

Access Paper or Ask Questions

Computational Studies in Influencer Marketing: A Systematic Literature Review

Jun 17, 2025

Haoyang Gui, Thales Bertaglia, Catalina Goanta, Gerasimos Spanakis

Abstract:Influencer marketing has become a crucial feature of digital marketing strategies. Despite its rapid growth and algorithmic relevance, the field of computational studies in influencer marketing remains fragmented, especially with limited systematic reviews covering the computational methodologies employed. This makes overarching scientific measurements in the influencer economy very scarce, to the detriment of interested stakeholders outside of platforms themselves, such as regulators, but also researchers from other fields. This paper aims to provide an overview of the state of the art of computational studies in influencer marketing by conducting a systematic literature review (SLR) based on the PRISMA model. The paper analyses 69 studies to identify key research themes, methodologies, and future directions in this research field. The review identifies four major research themes: Influencer identification and characterisation, Advertising strategies and engagement, Sponsored content analysis and discovery, and Fairness. Methodologically, the studies are categorised into machine learning-based techniques (e.g., classification, clustering) and non-machine-learning-based techniques (e.g., statistical analysis, network analysis). Key findings reveal a strong focus on optimising commercial outcomes, with limited attention to regulatory compliance and ethical considerations. The review highlights the need for more nuanced computational research that incorporates contextual factors such as language, platform, and industry type, as well as improved model explainability and dataset reproducibility. The paper concludes by proposing a multidisciplinary research agenda that emphasises the need for further links to regulation and compliance technology, finer granularity in analysis, and the development of standardised datasets.

* journal submission, under review

Via

Access Paper or Ask Questions

A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

May 27, 2025

Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Figure 1 for A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

Figure 2 for A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

Figure 3 for A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

Figure 4 for A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

Abstract:Understanding robustness is essential for building reliable NLP systems. Unfortunately, in the context of machine translation, previous work mainly focused on documenting robustness failures or improving robustness. In contrast, we study robustness from a model representation perspective by looking at internal model representations of ungrammatical inputs and how they evolve through model layers. For this purpose, we perform Grammatical Error Detection (GED) probing and representational similarity analysis. Our findings indicate that the encoder first detects the grammatical error, then corrects it by moving its representation toward the correct form. To understand what contributes to this process, we turn to the attention mechanism where we identify what we term Robustness Heads. We find that Robustness Heads attend to interpretable linguistic units when responding to grammatical errors, and that when we fine-tune models for robustness, they tend to rely more on Robustness Heads for updating the ungrammatical word representation.

* ACL 2025 Findings

Via

Access Paper or Ask Questions

MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Dec 18, 2024

Vageesh Saxena, Benjamin Bashpole, Gijs Van Dijck, Gerasimos Spanakis

Figure 1 for MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Figure 2 for MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Figure 3 for MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Figure 4 for MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data

Abstract:Human trafficking (HT) remains a critical issue, with traffickers increasingly leveraging online escort advertisements (ads) to advertise victims anonymously. Existing detection methods, including Authorship Attribution (AA), often center on text-based analyses and neglect the multimodal nature of online escort ads, which typically pair text with images. To address this gap, we introduce MATCHED, a multimodal dataset of 27,619 unique text descriptions and 55,115 unique images collected from the Backpage escort platform across seven U.S. cities in four geographical regions. Our study extensively benchmarks text-only, vision-only, and multimodal baselines for vendor identification and verification tasks, employing multitask (joint) training objectives that achieve superior classification and retrieval performance on in-distribution and out-of-distribution (OOD) datasets. Integrating multimodal features further enhances this performance, capturing complementary patterns across text and images. While text remains the dominant modality, visual data adds stylistic cues that enrich model performance. Moreover, text-image alignment strategies like CLIP and BLIP2 struggle due to low semantic overlap and vague connections between the modalities of escort ads, with end-to-end multimodal training proving more robust. Our findings emphasize the potential of multimodal AA (MAA) to combat HT, providing LEAs with robust tools to link ads and disrupt trafficking networks.

* 40 pages

Via

Access Paper or Ask Questions

Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Dec 15, 2024

Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Figure 1 for Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Figure 2 for Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Figure 3 for Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Figure 4 for Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Abstract:In this paper, we investigate the role of attention heads in Context-aware Machine Translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. We analyze their influence by both observing and modifying the attention scores corresponding to the plausible relations that could impact a pronoun prediction. Our findings reveal that while some heads do attend the relations of interest, not all of them influence the models' ability to disambiguate pronouns. We show that certain heads are underutilized by the models, suggesting that model performance could be improved if only the heads would attend one of the relations more strongly. Furthermore, we fine-tune the most promising heads and observe the increase in pronoun disambiguation accuracy of up to 5 percentage points which demonstrates that the improvements in performance can be solidified into the models' parameters.

* COLING 2025

Via

Access Paper or Ask Questions