Manan Dey

StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024

StarCoder: may the source be with you!

May 09, 2023

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023

SantaCoder: don't reach for the stars!

Jan 09, 2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022

How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts

May 22, 2022

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

Feb 02, 2022

Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP

Dec 20, 2021

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021

Evaluating Gender Bias in Natural Language Inference

May 12, 2021