Zaid Alyafeai

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources

Jan 25, 2022
Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Francesco De Toni, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilić, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Ortiz Suarez, Zeerak Talat, Daniel van Strien, Yacine Jernite

Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP

Dec 20, 2021
Sabrina J. Mielke, Zaid Alyafeai, Elizabeth Salesky, Colin Raffel, Manan Dey, Matthias Gallé, Arun Raja, Chenglei Si, Wilson Y. Lee, Benoît Sagot, Samson Tan

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

Masader: Metadata Sourcing for Arabic Text and Speech Data Resources

Oct 13, 2021
Zaid Alyafeai, Maraim Masoud, Mustafa Ghaleb, Maged S. Al-shaibani

Calliar: An Online Handwritten Dataset for Arabic Calligraphy

Jun 25, 2021
Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Yousif Ahmed Al-Wajih

Evaluating Various Tokenizers for Arabic Text Classification

Jun 14, 2021
Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Irfan Ahmad
