
Chunyuan Li

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

Apr 20, 2022

K-LITE: Learning Transferable Visual Models with External Knowledge

Apr 20, 2022

Unified Contrastive Learning in Image-Text-Label Space

Apr 07, 2022

Parameter-efficient Fine-tuning for Vision Transformers

Mar 29, 2022

Focal Modulation Networks

Mar 22, 2022

RegionCLIP: Region-based Language-Image Pretraining

Dec 16, 2021

LAFITE: Towards Language-Free Training for Text-to-Image Generation

Dec 13, 2021

Grounded Language-Image Pre-training

Dec 07, 2021

A Generic Approach for Enhancing GANs by Regularized Latent Optimization

Dec 07, 2021

Florence: A New Foundation Model for Computer Vision

Nov 22, 2021