Li Lucy

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Jan 31, 2024
Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo

Via arXiv

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

Jan 16, 2024
Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge

Via arXiv

"One-size-fits-all"? Observations and Expectations of NLG Systems Across Identity-Related Language Features

Oct 23, 2023
Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna Wallach, Alexandra Olteanu

Via arXiv

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

Dec 19, 2022
Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith

Via arXiv

Characterizing English Variation across Social Media Communities with BERT

Feb 12, 2021
Li Lucy, David Bamman

Via arXiv

Using Sentiment Induction to Understand Variation in Gendered Online Communities

Nov 16, 2018
Li Lucy, Julia Mendelsohn

Via arXiv

Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning

May 31, 2017
Li Lucy, Jon Gauthier

Via arXiv