Eve Fleisig

PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm

Jan 13, 2026
Via arXiv

Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions

Sep 10, 2025
Via arXiv

GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration

Feb 27, 2025
Via arXiv

Accurate and Data-Efficient Toxicity Prediction when Annotators Disagree

Oct 16, 2024
Via arXiv

ADVSCORE: A Metric for the Evaluation and Creation of Adversarial Benchmarks

Jun 24, 2024
Via arXiv

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

Jun 13, 2024
Via arXiv

Standard Language Ideology in AI-Generated Language

Jun 13, 2024
Via arXiv

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

May 09, 2024
Via arXiv

Mapping Social Choice Theory to RLHF

Apr 19, 2024
Via arXiv

Incorporating Worker Perspectives into MTurk Annotation Practices for NLP

Nov 16, 2023
Via arXiv