StereoSet


Contextual StereoSet: Stress-Testing Bias Alignment Robustness in Large Language Models

Jan 15, 2026

C2PO: Diagnosing and Disentangling Bias Shortcuts in LLMs

Dec 29, 2025

A Comprehensive Study of Implicit and Explicit Biases in Large Language Models

Nov 18, 2025

Augmenting Bias Detection in LLMs Using Topological Data Analysis

Aug 11, 2025

Semantic and Structural Analysis of Implicit Biases in Large Language Models: An Interpretable Approach

Aug 08, 2025

Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings

Apr 04, 2025

Rethinking Prompt-based Debiasing in Large Language Models

Mar 12, 2025

BiasEdit: Debiasing Stereotyped Language Models via Model Editing

Mar 11, 2025

LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language

Jan 23, 2025

Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework

Dec 20, 2024