Zhe Gan

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Sep 18, 2023
Via arXiv

MOFI: Learning Image Representations from Noisy Entity Annotated Images

Jun 24, 2023
Via arXiv

An Empirical Study of Multimodal Model Merging

Apr 28, 2023
Via arXiv

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Apr 14, 2023
Via arXiv

Generalized Decoding for Pixel, Image, and Language

Dec 21, 2022
Via arXiv

Exploring Discrete Diffusion Models for Image Captioning

Dec 09, 2022
Via arXiv

GRiT: A Generative Region-to-text Transformer for Object Understanding

Dec 01, 2022
Via arXiv

ReCo: Region-Controlled Text-to-Image Generation

Nov 23, 2022
Via arXiv

Non-Contrastive Learning Meets Language-Image Pre-Training

Oct 17, 2022
Via arXiv

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends

Oct 17, 2022
Via arXiv