Fartash Faghri

CLIP with Quality Captions: A Strong Pretraining for Vision Tasks

May 14, 2024

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Apr 24, 2024

Weight subcloning: direct initialization of transformers using larger pretrained ones

Dec 14, 2023

Label-efficient Training of Small Task-specific Models by Leveraging Vision Foundation Models

Nov 30, 2023

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Nov 28, 2023

TiC-CLIP: Continual Training of CLIP Models

Oct 24, 2023

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Oct 23, 2023

CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement

Oct 21, 2023

Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement

Mar 15, 2023

FastFill: Efficient Compatible Model Update

Mar 08, 2023