Donald Metzler

Transformer Memory as a Differentiable Search Index

Feb 16, 2022
Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler


Atomized Search Length: Beyond User Models

Jan 05, 2022
John Alex, Keith Hall, Donald Metzler


ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

Nov 22, 2021
Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler


Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

Sep 22, 2021
Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler


The Benchmark Lottery

Jul 14, 2021
Mostafa Dehghani, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals


Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

Jul 02, 2021
Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler


SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption

Jun 29, 2021
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler


How Reliable are Model Diagnostics?

May 12, 2021
Vamsi Aribandi, Yi Tay, Donald Metzler


Are Pre-trained Convolutions Better than Pre-trained Transformers?

May 07, 2021
Yi Tay, Mostafa Dehghani, Jai Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler
