Jianfeng Wang

Tony

Segment and Caption Anything

Dec 01, 2023
Via arXiv

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning

Nov 29, 2023
Via arXiv

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Nov 13, 2023
Via arXiv

MM-VID: Advancing Video Understanding with GPT-4V

Oct 30, 2023
Via arXiv

DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design

Oct 23, 2023
Via arXiv

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

Oct 12, 2023
Via arXiv

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

Oct 11, 2023
Via arXiv

OpenLEAF: Open-Domain Interleaved Image-Text Generation and Evaluation

Oct 11, 2023
Via arXiv

NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

Aug 05, 2023
Via arXiv

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

Aug 04, 2023
Via arXiv