Lu Yuan

Stephen

GLIPv2: Unifying Localization and Vision-Language Understanding

Jun 12, 2022
Via arXiv

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding

Jun 07, 2022
Via arXiv

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

Jun 03, 2022
Via arXiv

REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering

Jun 02, 2022
Via arXiv

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

May 15, 2022
Via arXiv

i-Code: An Integrative and Composable Multimodal Learning Framework

May 05, 2022
Via arXiv

Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks

Apr 28, 2022
Via arXiv

Residual Mixture of Experts

Apr 20, 2022
Via arXiv

K-LITE: Learning Transferable Visual Models with External Knowledge

Apr 20, 2022
Via arXiv

MiniViT: Compressing Vision Transformers with Weight Multiplexing

Apr 14, 2022
Via arXiv