Adina Williams

Meta AI

Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

May 07, 2024

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

Apr 24, 2024

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Apr 18, 2024

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

Apr 09, 2024

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Mar 26, 2024

Compositional learning of functions in humans and machines

Mar 18, 2024

EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

Dec 21, 2023

Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective

Nov 30, 2023

ROBBIE: Robust Bias Evaluation of Large Generative Language Models

Nov 29, 2023

The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks

Oct 26, 2023