Yi Tay

DSI++: Updating Transformer Memory with New Documents

Dec 19, 2022
Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler

Via arXiv

Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification

Dec 16, 2022
Jai Gupta, Yi Tay, Chaitanya Kamath, Vinh Q. Tran, Donald Metzler, Shailesh Bavadekar, Mimi Sun, Evgeniy Gabrilovich

Via arXiv

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

Dec 09, 2022
Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby

Via arXiv

Inverse scaling can become U-shaped

Nov 14, 2022
Jason Wei, Yi Tay, Quoc V. Le

Via arXiv

Scaling Instruction-Finetuned Language Models

Oct 20, 2022
Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei

Via arXiv

Transcending Scaling Laws with 0.1% Extra Compute

Oct 20, 2022
Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani

Via arXiv

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Oct 17, 2022
Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, Jason Wei

Via arXiv

Language Models are Multilingual Chain-of-Thought Reasoners

Oct 06, 2022
Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

Via arXiv

Recitation-Augmented Language Models

Oct 04, 2022
Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou

Via arXiv