PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts


Feb 02, 2022
Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Mike Tian-Jian Jiang, Alexander M. Rush



Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources


Jan 25, 2022
Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Francesco De Toni, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilić, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Ortiz Suarez, Zeerak Talat, Daniel van Strien, Yacine Jernite

* 8 pages plus appendix and references 


Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP


Dec 20, 2021
Sabrina J. Mielke, Zaid Alyafeai, Elizabeth Salesky, Colin Raffel, Manan Dey, Matthias Gallé, Arun Raja, Chenglei Si, Wilson Y. Lee, Benoît Sagot, Samson Tan

* 15-page preprint


Multitask Prompted Training Enables Zero-Shot Task Generalization


Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

* https://github.com/bigscience-workshop/promptsource/ 


Masader: Metadata Sourcing for Arabic Text and Speech Data Resources


Oct 13, 2021
Zaid Alyafeai, Maraim Masoud, Mustafa Ghaleb, Maged S. Al-shaibani



Calliar: An Online Handwritten Dataset for Arabic Calligraphy


Jun 25, 2021
Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Yousif Ahmed Al-Wajih



Evaluating Various Tokenizers for Arabic Text Classification


Jun 14, 2021
Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Irfan Ahmad

