Weiyun Wang

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 13, 2024

OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 12, 2024

Needle In A Multimodal Haystack

Jun 11, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Apr 29, 2024

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Mar 07, 2024

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

Feb 29, 2024

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer

Jan 18, 2024

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Aug 03, 2023

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

May 11, 2023

Demystify Transformers & Convolutions in Modern Image Deep Networks

Nov 10, 2022