Code for The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
bigscience-workshop/data-preparation/blob/main/preprocessing/training/clean.py
bigscience-workshop/data_tooling/wiki/datasets-hackathon
streamlit/streamlit
bigscience-workshop/catalogue_data/blob/master/clean_helpers/stopwords.py
undertheseanlp/underthesea
ontocord/muliwai/tree/main