Alexander Kolesnikov

FlexiViT: One Model for All Patch Sizes

Dec 15, 2022
Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetic

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022
Xi Chen, Xiao Wang, Soravit Changpinyo, AJ Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

May 27, 2022
Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, Neil Houlsby

Better plain ViT baselines for ImageNet-1k

May 03, 2022
Lucas Beyer, Xiaohua Zhai, Alexander Kolesnikov

LiT: Zero-Shot Transfer with Locked-image Text Tuning

Nov 15, 2021
Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

Jun 18, 2021
Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Knowledge distillation: A good teacher is patient and consistent

Jun 09, 2021
Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov

Scaling Vision Transformers

Jun 08, 2021
Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, Lucas Beyer

MLP-Mixer: An all-MLP Architecture for Vision

May 17, 2021
Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
