Abstract:Implicit bias refers to automatic or spontaneous mental processes that shape perceptions, judgments, and behaviors. Previous research examining `implicit bias' in large language models (LLMs) has often approached the phenomenon differently than how it is studied in humans by focusing primarily on model outputs rather than on model processing. To examine model processing, we present a method called the Reasoning Model Implicit Association Test (RM-IAT) for studying implicit bias-like patterns in reasoning models: LLMs that employ step-by-step reasoning to solve complex tasks. Using this method, we find that reasoning models require more tokens when processing association-incompatible information compared to association-compatible information. These findings suggest AI systems harbor patterns in processing information that are analogous to human implicit bias. We consider the implications of these implicit bias-like patterns for their deployment in real-world applications.
Abstract:Current research on bias in Vision Language Models (VLMs) has important limitations: it is focused exclusively on trait associations while ignoring other forms of stereotyping, it examines specific contexts where biases are expected to appear, and it conceptualizes social categories like race and gender as binary, ignoring the multifaceted nature of these identities. Using standardized facial images that vary in prototypicality, we test four VLMs for both trait associations and homogeneity bias in open-ended contexts. We find that VLMs consistently generate more uniform stories for women compared to men, with people who are more gender prototypical in appearance being represented more uniformly. By contrast, VLMs represent White Americans more uniformly than Black Americans. Unlike with gender prototypicality, race prototypicality was not related to stronger uniformity. In terms of trait associations, we find limited evidence of stereotyping-Black Americans were consistently linked with basketball across all models, while other racial associations (i.e., art, healthcare, appearance) varied by specific VLM. These findings demonstrate that VLM stereotyping manifests in ways that go beyond simple group membership, suggesting that conventional bias mitigation strategies may be insufficient to address VLM stereotyping and that homogeneity bias persists even when trait associations are less apparent in model outputs.
Abstract:Prior research show that Large Language Models (LLMs) and Vision-Language Models (VLMs) represent marginalized groups more homogeneously than dominant groups. However, the mechanisms underlying this homogeneity bias remain relatively unexplored. We propose that this bias emerges from systematic differences in the probability distributions from which tokens are sampled at inference-time. Analyzing three measures of uncertainty in token sampling distributions-entropy, perplexity, and probability of differentiation-we find that in some models, specifically GPT-4 Turbo and Llama-3.2, tokens are sampled more deterministically when generating texts about marginalized groups (i.e., Black Americans and women) compared to their dominant group counterparts (i.e., White Americans and men). While these findings may help explain homogeneity bias in certain models, the patterns did not replicate across all VLMs tested, suggesting multiple mechanisms may contribute to homogeneity bias in AI.
Abstract:Vision-Language Models trained on massive collections of human-generated data often reproduce and amplify societal stereotypes. One critical form of stereotyping reproduced by these models is homogeneity bias-the tendency to represent certain groups as more homogeneous than others. We investigate how this bias responds to hyperparameter adjustments in GPT-4, specifically examining sampling temperature and top p which control the randomness of model outputs. By generating stories about individuals from different racial and gender groups and comparing their similarities using vector representations, we assess both bias robustness and its relationship with hyperparameter values. We find that (1) homogeneity bias persists across most hyperparameter configurations, with Black Americans and women being represented more homogeneously than White Americans and men, (2) the relationship between hyperparameters and group representations shows unexpected non-linear patterns, particularly at extreme values, and (3) hyperparameter adjustments affect racial and gender homogeneity bias differently-while increasing temperature or decreasing top p can reduce racial homogeneity bias, these changes show different effects on gender homogeneity bias. Our findings suggest that while hyperparameter tuning may mitigate certain biases to some extent, it cannot serve as a universal solution for addressing homogeneity bias across different social group dimensions.
Abstract:Vision-Language Models (VLMs) combine Large Language Model (LLM) capabilities with image processing, enabling tasks like image captioning and text-to-image generation. Yet concerns persist about their potential to amplify human-like biases, including skin tone bias. Skin tone bias, where darker-skinned individuals face more negative stereotyping than lighter-skinned individuals, is well-documented in the social sciences but remains under-explored in Artificial Intelligence (AI), particularly in VLMs. While well-documented in the social sciences, this bias remains under-explored in AI, particularly in VLMs. Using the GAN Face Database, we sampled computer-generated images of Black American men and women, controlling for skin tone variations while keeping other features constant. We then asked VLMs to write stories about these faces and compared the homogeneity of the generated stories. Stories generated by VLMs about darker-skinned Black individuals were more homogeneous than those about lighter-skinned individuals in three of four models, and Black women were consistently represented more homogeneously than Black men across all models. Interaction effects revealed a greater impact of skin tone on women in two VLMs, while the other two showed nonsignificant results, reflecting known stereotyping patterns. These findings underscore the propagation of biases from single-modality AI systems to multimodal models and highlight the need for further research to address intersectional biases in AI.
Abstract:Homogeneity bias in Large Language Models (LLMs) refers to their tendency to homogenize the representations of some groups compared to others. Previous studies documenting this bias have predominantly used encoder models, which may have inadvertently introduced biases. To address this limitation, we prompted GPT-4 to generate single word/expression completions associated with 18 situation cues - specific, measurable elements of environments that influence how individuals perceive situations and compared the variability of these completions using probability of differentiation. This approach directly assessed homogeneity bias from the model's outputs, bypassing encoder models. Across five studies, we find that homogeneity bias is highly volatile across situation cues and writing prompts, suggesting that the bias observed in past work may reflect those within encoder models rather than LLMs. Furthermore, these results suggest that homogeneity bias in LLMs is brittle, as even minor and arbitrary changes in prompts can significantly alter the expression of biases. Future work should further explore how variations in syntactic features and topic choices in longer text generations influence homogeneity bias in LLMs.
Abstract:Large Language Models (LLMs) have become pervasive in everyday life, yet their inner workings remain opaque. While scholarly efforts have demonstrated LLMs' propensity to reproduce biases in their training data, they have primarily focused on the association of social groups with stereotypic attributes. In this paper, we extend this line of inquiry to investigate a bias akin to the social-psychological phenomenon where socially dominant groups are perceived to be less homogeneous than socially subordinate groups as it is reproduced by LLMs. We had ChatGPT, a state-of-the-art LLM, generate a diversity of texts about intersectional group identities and compared text homogeneity. We consistently find that LLMs portray African, Asian, and Hispanic Americans as more homogeneous than White Americans. They also portray women as more homogeneous than men, but these differences are small. Finally, we find that the effect of gender differs across racial/ethnic groups such that the effect of gender is consistent within African and Hispanic Americans but not within Asian and White Americans. We speculate possible sources of this bias in LLMs and posit that the bias has the potential to amplify biases in future LLM training and to reinforce stereotypes.