Alert button
Picture for Zaoshan Huang

Zaoshan Huang

Alert button

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Add code
Bookmark button
Alert button
Mar 22, 2024
Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou

Figure 1 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 2 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 3 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 4 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Viaarxiv icon

VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Add code
Bookmark button
Alert button
Mar 14, 2024
Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou

Figure 1 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 2 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 3 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 4 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Viaarxiv icon

UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

Add code
Bookmark button
Alert button
Nov 16, 2023
Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

Viaarxiv icon