Anne Lauscher

Local Contrastive Editing of Gender Stereotypes

Oct 23, 2024

Marlene Lutz, Rochelle Choenni, Markus Strohmaier, Anne Lauscher

Abstract:Stereotypical bias encoded in language models (LMs) poses a threat to safe language technology, yet our understanding of how bias manifests in the parameters of LMs remains incomplete. We introduce local contrastive editing that enables the localization and editing of a subset of weights in a target model in relation to a reference model. We deploy this approach to identify and modify subsets of weights that are associated with gender stereotypes in LMs. Through a series of experiments, we demonstrate that local contrastive editing can precisely localize and control a small subset (< 0.5%) of weights that encode gender bias. Our work (i) advances our understanding of how stereotypical biases can manifest in the parameter space of LMs and (ii) opens up new avenues for developing parameter-efficient strategies for controlling model properties in a contrastive manner.

* Accepted at EMNLP 2024

SoK: Towards Security and Safety of Edge AI

Oct 07, 2024

Tatjana Wingarz, Anne Lauscher, Janick Edinger, Dominik Kaaser, Stefan Schulte, Mathias Fischer

Abstract:Advanced AI applications have become increasingly available to a broad audience, e.g., as centrally managed large language models (LLMs). Such centralization is both a risk and a performance bottleneck - Edge AI promises to be a solution to these problems. However, its decentralized approach raises additional challenges regarding security and safety. In this paper, we argue that both of these aspects are critical for Edge AI, and even more so, their integration. Concretely, we survey security and safety threats, summarize existing countermeasures, and collect open challenges as a call for more research in this area.

The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification

Sep 26, 2024

Andreas Waldis, Joel Birrer, Anne Lauscher, Iryna Gurevych

Abstract:Gender-fair language, an evolving German linguistic variation, fosters inclusion by addressing all genders or using neutral forms. Nevertheless, there is a significant lack of resources to assess the impact of this linguistic shift on classification using language models (LMs), which are probably not trained on such variations. To address this gap, we present Lou, the first dataset featuring high-quality reformulations for German text classification covering seven tasks, like stance detection and toxicity classification. Evaluating 16 mono- and multi-lingual LMs on Lou shows that gender-fair language substantially impacts predictions by flipping labels, reducing certainty, and altering attention patterns. However, existing evaluations remain valid, as LM rankings of original and reformulated instances do not significantly differ. While we offer initial insights on the effect on German text classification, the findings likely apply to other languages, as consistent patterns were observed in multi-lingual and English LMs.

Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case

Sep 09, 2024

Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher, Dietrich Klakow

Abstract:While measuring bias and robustness in coreference resolution are important goals, such measurements are only as good as the tools we use to measure them with. Winogender schemas (Rudinger et al., 2018) are an influential dataset proposed to evaluate gender bias in coreference resolution, but a closer look at the data reveals issues with the instances that compromise their use for reliable evaluation, including treating different grammatical cases of pronouns in the same way, violations of template constraints, and typographical errors. We identify these issues and fix them, contributing a new dataset: Winogender 2.0. Our changes affect performance with state-of-the-art supervised coreference resolution systems as well as all model sizes of the language model FLAN-T5, with F1 dropping on average 0.1 points. We also propose a new method to evaluate pronominal bias in coreference resolution that goes beyond the binary. With this method and our new dataset which is balanced for grammatical case, we empirically demonstrate that bias characteristics vary not just across pronoun sets, but also across surface forms of those sets.

Decoding Multilingual Moral Preferences: Unveiling LLM's Biases Through the Moral Machine Experiment

Jul 21, 2024

Karina Vida, Fabian Damken, Anne Lauscher

Abstract:Large language models (LLMs) increasingly find their way into the most diverse areas of our everyday lives. They indirectly influence people's decisions or opinions through their daily use. Therefore, understanding how and which moral judgements these LLMs make is crucial. However, morality is not universal and depends on the cultural background. This raises the question of whether these cultural preferences are also reflected in LLMs when prompted in different languages or whether moral decision-making is consistent across different languages. So far, most research has focused on investigating the inherent values of LLMs in English. While a few works conduct multilingual analyses of moral bias in LLMs, these analyses do not go beyond atomic actions. To the best of our knowledge, a multilingual analysis of moral bias in dilemmas has not yet been conducted. To address this, our paper builds on the moral machine experiment (MME) to investigate the moral preferences of five LLMs, Falcon, Gemini, Llama, GPT, and MPT, in a multilingual setting and compares them with the preferences collected from humans belonging to different cultures. To accomplish this, we generate 6500 scenarios of the MME and prompt the models in ten languages on which action to take. Our analysis reveals that all LLMs exhibit different moral biases to some degree and that they not only differ from the human preferences but also across multiple languages within the models themselves. Moreover, we find that almost all models, particularly Llama 3, diverge greatly from human values and, for instance, prefer saving fewer people over saving more.

* to be published in AIES 2024 Proceedings

Why do LLaVA Vision-Language Models Reply to Images in English?

Jul 02, 2024

Musashi Hinck, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal

Abstract:We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query. This paper investigates the causes of this loss with a two-pronged approach that combines extensive ablation of the design space with a mechanistic analysis of the models' internal representations of image and text inputs. Both approaches indicate that the issue stems from the language modelling component of the LLaVA model. Statistically, we find that switching the language backbone for a bilingual language model has the strongest effect on reducing this error. Mechanistically, we provide compelling evidence that visual inputs are not mapped to a similar space as text ones, and that intervening on intermediary attention layers can reduce this bias. Our findings provide important insights to researchers and engineers seeking to understand the crossover between multimodal and multilingual spaces, and contribute to the goal of developing capable and inclusive VLMs for non-English contexts.

* Pre-print

Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German

Jun 10, 2024

Manuel Lardelli, Giuseppe Attanasio, Anne Lauscher

Abstract:The translation of gender-neutral person-referring terms (e.g., the students) is often non-trivial. Translating from English into German poses an interesting case -- in German, person-referring nouns are usually gender-specific, and if the gender of the referent(s) is unknown or diverse, the generic masculine (die Studenten (m.)) is commonly used. This solution, however, reduces the visibility of other genders, such as women and non-binary people. To counteract gender discrimination, a societal movement towards using gender-fair language exists (e.g., by adopting neosystems). However, gender-fair German is currently barely supported in machine translation (MT), requiring post-editing or manual translations. We address this research gap by studying gender-fair language in English-to-German MT. Concretely, we enrich a community-created gender-fair language dictionary and sample multi-sentence test instances from encyclopedic text and parliamentary speeches. Using these novel resources, we conduct the first benchmark study involving two commercial systems and six neural MT models for translating words in isolation and natural contexts across two domains. Our findings show that most systems produce mainly masculine forms and rarely gender-neutral variants, highlighting the need for future research. We release code and data at https://github.com/g8a9/building-bridges-gender-fair-german-mt.

* Accepted to Findings of ACL 2024. Code and data at https://github.com/g8a9/building-bridges-gender-fair-german-mt

Stop! In the Name of Flaws: Disentangling Personal Names and Sociodemographic Attributes in NLP

May 27, 2024

Vagrant Gautam, Arjun Subramonian, Anne Lauscher, Os Keyes

Abstract:Personal names simultaneously differentiate individuals and categorize them in ways that are important in a given society. While the natural language processing community has thus associated personal names with sociodemographic characteristics in a variety of tasks, researchers have engaged to varying degrees with the established methodological problems in doing so. To guide future work, we present an interdisciplinary background on names and naming. We then survey the issues inherent to associating names with sociodemographic attributes, covering problems of validity (e.g., systematic error, construct validity), as well as ethical concerns (e.g., harms, differential impact, cultural insensitivity). Finally, we provide guiding questions along with normative recommendations to avoid validity and ethical pitfalls when dealing with names and sociodemographic characteristics in natural language processing.

The Echoes of Multilinguality: Tracing Cultural Value Shifts during LM Fine-tuning

May 21, 2024

Rochelle Choenni, Anne Lauscher, Ekaterina Shutova

Abstract:Texts written in different languages reflect different culturally-dependent beliefs of their writers. Thus, we expect multilingual LMs (MLMs), which are jointly trained on a concatenation of text in multiple languages, to encode different cultural values for each language. Yet, as the 'multilinguality' of these LMs is driven by cross-lingual sharing, we also have reason to believe that cultural values bleed over from one language into another. This limits the use of MLMs in practice, as apart from being proficient in generating text in multiple languages, creating language technology that can serve a community also requires the output of LMs to be sensitive to their biases (Naous et al., 2023). Yet, little is known about how cultural values emerge and evolve in MLMs (Hershcovich et al., 2022a). We are the first to study how languages can exert influence on the cultural values encoded for different test languages, by studying how such values are revised during fine-tuning. Focusing on the fine-tuning stage allows us to study the interplay between value shifts when exposed to new linguistic experience from different data sources and languages. Lastly, we use a training data attribution method to find patterns in the fine-tuning examples, and the languages that they come from, that tend to instigate value shifts.

Large Language Models for Human-Machine Collaborative Particle Accelerator Tuning through Natural Language

May 14, 2024

Jan Kaiser, Annika Eichler, Anne Lauscher

Abstract:Autonomous tuning of particle accelerators is an active and challenging field of research with the goal of enabling novel accelerator technologies and cutting-edge high-impact applications, such as physics discovery, cancer research and material sciences. A key challenge with autonomous accelerator tuning remains that the most capable algorithms require an expert in optimisation, machine learning or a similar field to implement the algorithm for every new tuning task. In this work, we propose the use of large language models (LLMs) to tune particle accelerators. We demonstrate on a proof-of-principle example the ability of LLMs to successfully and autonomously tune a particle accelerator subsystem based on nothing more than a natural language prompt from the operator, and compare the performance of our LLM-based solution to state-of-the-art optimisation algorithms, such as Bayesian optimisation (BO) and reinforcement learning-trained optimisation (RLO). In doing so, we also show how LLMs can perform numerical optimisation of a highly non-linear real-world objective function. Ultimately, this work represents yet another complex task that LLMs are capable of solving and promises to help accelerate the deployment of autonomous tuning algorithms to the day-to-day operations of particle accelerators.

* 22 pages, 5 figures
