Yousong Zhu

Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models

Oct 21, 2024
Via arXiv

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Mar 14, 2024
Via arXiv

Mitigating Hallucination in Visual Language Models with Visual Supervision

Nov 27, 2023
Via arXiv

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Nov 27, 2023
Via arXiv

Efficient Masked Autoencoders with Self-Consistency

Feb 28, 2023
Via arXiv

Masked Contrastive Pre-Training for Efficient Video-Text Retrieval

Dec 05, 2022
Via arXiv

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

Dec 03, 2022
Via arXiv

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Sep 28, 2022
Via arXiv

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Mar 14, 2022
Via arXiv

Part-Aware Self-Supervised Pre-Training for Person Re-Identification

Mar 08, 2022
Via arXiv