Colin Raffel

Merging Models with Fisher-Weighted Averaging

Nov 18, 2021
Michael Matena, Colin Raffel

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

An Empirical Survey of Data Augmentation for Limited Data Learning in NLP

Jun 14, 2021
Jiaao Chen, Derek Tam, Colin Raffel, Mohit Bansal, Diyi Yang

On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition

Jun 06, 2021
Ching-Yuan Bai, Hsuan-Tien Lin, Colin Raffel, Wendy Chih-wen Kan

ByT5: Towards a token-free future with pre-trained byte-to-byte models

May 28, 2021
Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel

Improving and Simplifying Pattern Exploiting Training

Mar 22, 2021
Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel

Do Transformer Modifications Transfer Across Implementations and Applications?

Feb 23, 2021
Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

Jan 01, 2021
Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih

Extracting Training Data from Large Language Models

Dec 14, 2020
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel

mT5: A massively multilingual pre-trained text-to-text transformer

Oct 23, 2020
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
