Virginia K. Felkner

GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction

May 24, 2024

Virginia K. Felkner, Jennifer A. Thompson, Jonathan May

Abstract: Social biases in LLMs are usually measured via bias benchmark datasets. Current benchmarks have limitations in scope, grounding, quality, and human effort required. Previous work has shown success with a community-sourced, rather than crowd-sourced, approach to benchmark development. However, this work still required considerable effort from annotators with relevant lived experience. This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. We also extend the previous work to a new community and set of biases: the Jewish community and antisemitism. Our analysis shows that GPT-3.5-Turbo has poor performance on this annotation task and produces unacceptable quality issues in its output. Thus, we conclude that GPT-3.5-Turbo is not an appropriate substitute for human annotation in sensitive tasks related to social biases, and that its use actually negates many of the benefits of community-sourcing bias benchmarks.

* Accepted to ACL 2024 (main conference)

WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models

Jun 26, 2023

Virginia K. Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May

Abstract: We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community. The benchmark is community-sourced, via application of a novel method that generates a bias benchmark from a community survey. We apply our benchmark to several popular LLMs and find that off-the-shelf models generally do exhibit considerable anti-queer bias. Finally, we show that LLM bias against a marginalized community can be somewhat mitigated by finetuning on data written about or by members of that community, and that social media text written by community members is more effective than news text written about the community by non-members. Our method for community-in-the-loop benchmark development provides a blueprint for future researchers to develop community-driven, harms-grounded LLM benchmarks for other marginalized communities.

* Accepted to ACL 2023 (main conference). Camera-ready version

Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models

Jun 23, 2022

Virginia K. Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May

Abstract: This paper presents exploratory work on whether and to what extent biases against queer and trans people are encoded in large language models (LLMs) such as BERT. We also propose a method for reducing these biases in downstream tasks: finetuning the models on data written by and/or about queer people. To measure anti-queer bias, we introduce a new benchmark dataset, WinoQueer, modeled after other bias-detection benchmarks but addressing homophobic and transphobic biases. We found that BERT shows significant homophobic bias, but this bias can be mostly mitigated by finetuning BERT on a natural language corpus written by members of the LGBTQ+ community.

* Accepted to Queer in AI Workshop @ NAACL 2022
