Adina Williams

Meta AI

FACTORY: A Challenging Human-Verified Prompt Set for Long-Form Factuality

Jul 31, 2025
Via arXiv

IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments

Jun 11, 2025
Via arXiv

Arbiters of Ambivalence: Challenges of Using LLMs in No-Consensus Tasks

May 28, 2025
Via arXiv

Do different prompting methods yield a common task representation in language models?

May 17, 2025
Via arXiv

Domain Regeneration: How well do LLMs match syntactic properties of text domains?

May 12, 2025
Via arXiv

Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

Apr 10, 2025
Via arXiv

Chained Tuning Leads to Biased Forgetting

Dec 21, 2024
Via arXiv

What makes a good metric? Evaluating automatic metrics for text-to-image consistency

Dec 18, 2024
Via arXiv

Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

Dec 06, 2024
Via arXiv

Transformers Can Navigate Mazes With Multi-Step Prediction

Dec 06, 2024
Via arXiv