
Dara Bahri

Long Range Arena: A Benchmark for Efficient Transformers

Nov 08, 2020

Surprise: Result List Truncation via Extreme Value Theory

Oct 19, 2020

Efficient Transformers: A Survey

Sep 16, 2020

Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study

Aug 17, 2020

HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections

Jul 12, 2020

Synthesizer: Rethinking Self-Attention in Transformer Models

May 02, 2020

Deep k-NN for Noisy Labels

Apr 26, 2020

Choppy: Cut Transformer For Ranked List Truncation

Apr 26, 2020

Reverse Engineering Configurations of Neural Text Generation Models

Apr 13, 2020

Sparse Sinkhorn Attention

Feb 26, 2020