Yi Tay

Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection
Feb 09, 2021
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
Dec 15, 2020
Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville

Long Range Arena: A Benchmark for Efficient Transformers
Nov 08, 2020
Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

Surprise: Result List Truncation via Extreme Value Theory
Oct 19, 2020
Dara Bahri, Che Zheng, Yi Tay, Donald Metzler, Andrew Tomkins

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
Oct 06, 2020
Alvin Chan, Yi Tay, Yew-Soon Ong, Aston Zhang

Efficient Transformers: A Survey
Sep 16, 2020
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler

Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study
Aug 17, 2020
Dara Bahri, Yi Tay, Che Zheng, Donald Metzler, Cliff Brunk, Andrew Tomkins

HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
Jul 12, 2020
Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan

Synthesizer: Rethinking Self-Attention in Transformer Models
May 02, 2020
Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng