Zhiyang Chen

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Aug 07, 2024
Via arXiv

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Nov 27, 2023
Via arXiv

Mitigating Hallucination in Visual Language Models with Visual Supervision

Nov 27, 2023
Via arXiv

Efficient Masked Autoencoders with Self-Consistency

Feb 28, 2023
Via arXiv

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Sep 28, 2022
Via arXiv

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Mar 14, 2022
Via arXiv

DPT: Deformable Patch-based Transformer for Visual Recognition

Jul 30, 2021
Via arXiv

MST: Masked Self-Supervised Transformer for Visual Representation

Jun 10, 2021
Via arXiv