Joshua Ainslie

CoLT5: Faster Long-Range Transformers with Conditional Computation
Mar 17, 2023
Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
Jan 25, 2023
Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Joshua Ainslie, Sumit Sanghai, Fei Sha, William Cohen

FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
Dec 15, 2022
Michiel de Jong, Yury Zemlyanskiy, Joshua Ainslie, Nicholas FitzGerald, Sumit Sanghai, Fei Sha, William Cohen

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Dec 09, 2022
Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby

Generate-and-Retrieve: use your predictions to improve retrieval for semantic parsing
Sep 29, 2022
Yury Zemlyanskiy, Michiel de Jong, Joshua Ainslie, Panupong Pasupat, Peter Shaw, Linlu Qiu, Sumit Sanghai, Fei Sha

Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
May 24, 2022
James Lee-Thorp, Joshua Ainslie

LogicInference: A New Dataset for Teaching Logical Inference to seq2seq Models
Apr 11, 2022
Santiago Ontanon, Joshua Ainslie, Vaclav Cvicek, Zachary Fisher

FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
Mar 24, 2022
Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister

LongT5: Efficient Text-To-Text Transformer for Long Sequences
Dec 15, 2021
Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang

Iterative Decoding for Compositional Generalization in Transformers
Oct 08, 2021
Luana Ruiz, Joshua Ainslie, Santiago Ontañón
