Alexander Kolesnikov

Sigmoid Loss for Language Image Pre-Training

Mar 30, 2023

Tuning computer vision models with task rewards

Feb 16, 2023

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023

FlexiViT: One Model for All Patch Sizes

Dec 15, 2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022

UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

May 27, 2022

Better plain ViT baselines for ImageNet-1k

May 03, 2022

LiT: Zero-Shot Transfer with Locked-image Text Tuning

Nov 15, 2021

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

Jun 18, 2021

Knowledge distillation: A good teacher is patient and consistent

Jun 09, 2021