Zehuan Yuan

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Dec 25, 2023 (via arXiv)

General Object Foundation Model for Images and Videos at Scale

Dec 14, 2023 (via arXiv)

Recognize Any Regions

Nov 02, 2023 (via arXiv)

CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection

Oct 25, 2023 (via arXiv)

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

Aug 23, 2023 (via arXiv)

Exploring Transformers for Open-world Instance Segmentation

Aug 08, 2023 (via arXiv)

ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst

May 25, 2023 (via arXiv)

EGC: Image Generation and Classification via a Diffusion Energy-Based Model

Apr 13, 2023 (via arXiv)

Meta Compositional Referring Expression Segmentation

Apr 12, 2023 (via arXiv)

Token Boosting for Robust Self-Supervised Visual Transformer Pre-training

Apr 12, 2023 (via arXiv)