Yousong Zhu

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Mar 14, 2024
Via arXiv

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Nov 27, 2023
Via arXiv

Mitigating Hallucination in Visual Language Models with Visual Supervision

Nov 27, 2023
Via arXiv

Efficient Masked Autoencoders with Self-Consistency

Feb 28, 2023
Via arXiv

Masked Contrastive Pre-Training for Efficient Video-Text Retrieval

Dec 05, 2022
Via arXiv

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

Dec 03, 2022
Via arXiv

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Sep 28, 2022
Via arXiv

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Mar 14, 2022
Via arXiv

Part-Aware Self-Supervised Pre-Training for Person Re-Identification

Mar 08, 2022
Via arXiv

DPT: Deformable Patch-based Transformer for Visual Recognition

Jul 30, 2021
Via arXiv