Mostafa Dehghani

PolyViT: Co-training Vision Transformers on Images, Videos and Audio

Nov 25, 2021
Valerii Likhosherstov, Anurag Arnab, Krzysztof Choromanski, Mario Lucic, Yi Tay, Adrian Weller, Mostafa Dehghani

Discrete Representations Strengthen Vision Transformer Robustness

Nov 20, 2021
Chengzhi Mao, Lu Jiang, Mostafa Dehghani, Carl Vondrick, Rahul Sukthankar, Irfan Essa

The Efficiency Misnomer

Oct 25, 2021
Mostafa Dehghani, Anurag Arnab, Lucas Beyer, Ashish Vaswani, Yi Tay

SCENIC: A JAX Library for Computer Vision Research and Beyond

Oct 18, 2021
Mostafa Dehghani, Alexey Gritsenko, Anurag Arnab, Matthias Minderer, Yi Tay

Exploring the Limits of Large Scale Pre-training

Oct 05, 2021
Samira Abnar, Mostafa Dehghani, Behnam Neyshabur, Hanie Sedghi

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

Sep 22, 2021
Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler

The Benchmark Lottery

Jul 14, 2021
Mostafa Dehghani, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

Jun 21, 2021
Michael S. Ryoo, AJ Piergiovanni, Anurag Arnab, Mostafa Dehghani, Anelia Angelova

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent

Jun 10, 2021
Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks

Jun 08, 2021
Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson
