Lake Yin

DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs

May 15, 2025

Lake Yin, Fan Huang

Abstract: As Large Language Models (LLMs) have risen in prominence over the past few years, there has been concern over the potential biases in LLMs inherited from the training data. Previous studies have examined how LLMs exhibit implicit bias, such as when response generation changes when different social contexts are introduced. We argue that this implicit bias is not only an ethical, but also a technical issue, as it reveals an inability of LLMs to accommodate extraneous information. However, unlike other measures of LLM intelligence, there are no standard methods to benchmark this specific subset of LLM bias. To bridge this gap, we developed a method for calculating an easily interpretable benchmark, DIF (Demographic Implicit Fairness), by evaluating preexisting LLM logic and math problem datasets with sociodemographic personas. We demonstrate that this method can statistically validate the presence of implicit bias in LLM behavior and find an inverse trend between question answering accuracy and implicit bias, supporting our argument.

* 7 pages, 1 figure

Analyzing Trendy Twitter Hashtags in the 2022 French Election

Oct 11, 2023

Aamir Mandviwalla, Lake Yin, Boleslaw K. Szymanski

Abstract: Regressions trained to predict the future activity of social media users need rich features for accurate predictions. Many advanced models exist to generate such features; however, the time complexities of their computations are often prohibitive when they run on enormous data-sets. Some studies have shown that simple semantic network features can be rich enough to use for regressions without requiring complex computations. We propose a method for using semantic networks as user-level features for machine learning tasks. We conducted an experiment using a semantic network of 1037 Twitter hashtags from a corpus of 3.7 million tweets related to the 2022 French presidential election. A bipartite graph is formed where hashtags are nodes and weighted edges connect the hashtags reflecting the number of Twitter users that interacted with both hashtags. The graph is then transformed into a maximum-spanning tree with the most popular hashtag as its root node to construct a hierarchy amongst the hashtags. We then provide a vector feature for each user based on this tree. To validate the usefulness of our semantic feature we performed a regression experiment to predict the response rate of each user with six emotions like anger, enjoyment, or disgust. Our semantic feature performs well with the regression with most emotions having $R^2$ above 0.5. These results suggest that our semantic feature could be considered for use in further experiments predicting social media response on big data-sets.

* 9 pages, 1 figure, to be published in Complex Networks 2023
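
The pipeline outlined in the abstract above (hashtag co-occurrence weights, a maximum spanning tree rooted at the most popular hashtag, then a per-user vector) can be illustrated with a small sketch. This is not the authors' implementation: the toy data, the function names, and the use of Prim's algorithm are all assumptions. In particular, the abstract does not say how the user vector is derived from the tree, so `user_feature` (a count of a user's hashtags at each tree depth) is a hypothetical stand-in.

```python
import heapq
from collections import Counter, defaultdict
from itertools import combinations

# Toy data (invented for illustration): user -> hashtags the user interacted with.
user_tags = {
    "u1": {"#election", "#macron"},
    "u2": {"#election", "#macron"},
    "u3": {"#election", "#macron", "#debate"},
    "u4": {"#macron", "#debate"},
    "u5": {"#election", "#lepen"},
    "u6": {"#election", "#lepen"},
}


def cooccurrence_weights(user_tags):
    """Edge weight = number of users who interacted with both hashtags."""
    weights = Counter()
    for tags in user_tags.values():
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return weights


def maximum_spanning_tree(weights, root):
    """Prim's algorithm on negated weights; returns a child -> parent map."""
    adj = defaultdict(list)
    for (a, b), w in weights.items():
        adj[a].append((w, b))
        adj[b].append((w, a))
    parent = {root: None}
    heap = [(-w, nbr, root) for w, nbr in adj[root]]
    heapq.heapify(heap)
    while heap:
        _, node, via = heapq.heappop(heap)
        if node in parent:
            continue  # already attached through a heavier edge
        parent[node] = via
        for w, nbr in adj[node]:
            if nbr not in parent:
                heapq.heappush(heap, (-w, nbr, node))
    return parent


def tree_depths(parent):
    """Number of edges between each hashtag and the root."""
    def depth_of(node):
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d
    return {node: depth_of(node) for node in parent}


def user_feature(tags, depth, max_depth):
    """Hypothetical feature: how many of the user's hashtags sit at each depth."""
    vec = [0] * (max_depth + 1)
    for tag in tags:
        if tag in depth:
            vec[depth[tag]] += 1
    return vec


# Root = hashtag used by the most users.
popularity = Counter(tag for tags in user_tags.values() for tag in tags)
root = max(popularity, key=popularity.get)

parent = maximum_spanning_tree(cooccurrence_weights(user_tags), root)
depth = tree_depths(parent)
max_depth = max(depth.values())
features = {u: user_feature(t, depth, max_depth) for u, t in user_tags.items()}
```

On this toy data, `#debate` attaches to `#macron` rather than directly to the root because the `#macron`/`#debate` edge (two shared users) outweighs the `#election`/`#debate` edge (one shared user). The resulting fixed-length vectors could then be passed to any regression model.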
