Pete Walsh

OLMo: Accelerating the Science of Language Models

Feb 07, 2024
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Jan 31, 2024
Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Dec 15, 2023
Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Akshita Bhagia, Ian Magnusson, Hao Peng, Oyvind Tafjord, Pete Walsh, Kyle Richardson, Jesse Dodge

What's In My Big Data?

Oct 31, 2023
Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

Continued Pretraining for Better Zero- and Few-Shot Promptability

Oct 19, 2022
Zhaofeng Wu, Robert L. Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, Iz Beltagy

Staged Training for Transformer Language Models

Mar 11, 2022
Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew Peters, Iz Beltagy
