Yi Tay

Larger language models do in-context learning differently

Mar 08, 2023
Via arXiv

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

Feb 14, 2023
Via arXiv

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023
Via arXiv

DSI++: Updating Transformer Memory with New Documents

Dec 19, 2022
Via arXiv

Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification

Dec 16, 2022
Via arXiv

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

Dec 09, 2022
Via arXiv

Inverse scaling can become U-shaped

Nov 14, 2022
Via arXiv

Scaling Instruction-Finetuned Language Models

Oct 20, 2022
Via arXiv

Transcending Scaling Laws with 0.1% Extra Compute

Oct 20, 2022
Via arXiv

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Oct 17, 2022
Via arXiv