Weicheng Kuo

3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation

Jan 04, 2024
Via arXiv

Detection-Oriented Image-Text Pretraining for Open-Vocabulary Detection

Sep 29, 2023
Via arXiv

Contrastive Feature Masking Open-Vocabulary Vision Transformer

Sep 02, 2023
Via arXiv

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Jun 02, 2023
Via arXiv

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

May 11, 2023
Via arXiv

RECLIP: Resource-efficient CLIP by Training with Small Images

Apr 12, 2023
Via arXiv

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Mar 30, 2023
Via arXiv

Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

Dec 06, 2022
Via arXiv

F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

Sep 30, 2022
Via arXiv

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022
Via arXiv