
Ben Bogin

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Jan 31, 2024

Leveraging Code to Improve In-context Learning for Semantic Parsing

Nov 16, 2023

Answering Questions by Meta-Reasoning over Multiple Chains of Thought

Apr 25, 2023

Diverse Demonstrations Improve In-context Compositional Generalization

Dec 20, 2022

Training Vision-Language Models with Less Bimodal Supervision

Nov 01, 2022

Unobserved Local Structures Make Compositional Generalization Hard

Jan 15, 2022

COVR: A test-bed for Visually Grounded Compositional Generalization with real images

Sep 22, 2021

Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data

Jun 09, 2021

MedICaT: A Dataset of Medical Images, Captions, and Textual References

Oct 12, 2020