CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Sep 17, 2023
Thuat Nguyen, Chien Van Nguyen, Viet Dac Lai, Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen

View paper on arXiv
