Alert button
Picture for Bang Yang

Bang Yang

Alert button

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Add code
Bookmark button
Alert button
Mar 22, 2024
Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou

Figure 1 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 2 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 3 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Figure 4 for VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding
Viaarxiv icon

VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Add code
Bookmark button
Alert button
Mar 14, 2024
Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou

Figure 1 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 2 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 3 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Figure 4 for VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework
Viaarxiv icon

WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs

Add code
Bookmark button
Alert button
Mar 10, 2024
Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou

Figure 1 for WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs
Figure 2 for WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs
Figure 3 for WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs
Figure 4 for WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs
Viaarxiv icon

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning

Add code
Bookmark button
Alert button
Jan 30, 2024
Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou

Viaarxiv icon

Improving Medical Report Generation with Adapter Tuning and Knowledge Enhancement in Vision-Language Foundation Models

Add code
Bookmark button
Alert button
Dec 07, 2023
Shibin Wu, Bang Yang, Zhiyu Ye, Haoqian Wang, Hairong Zheng, Tong Zhang

Viaarxiv icon

UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

Add code
Bookmark button
Alert button
Nov 16, 2023
Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

Viaarxiv icon

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning

Add code
Bookmark button
Alert button
Aug 25, 2023
Bang Yang, Fenglin Liu, Xian Wu, Yaowei Wang, Xu Sun, Yuexian Zou

Figure 1 for MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Figure 2 for MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Figure 3 for MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Figure 4 for MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Viaarxiv icon

Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels

Add code
Bookmark button
Alert button
Jul 05, 2023
Bang Yang, Fenglin Liu, Zheng Li, Qingyu Yin, Chenyu You, Bing Yin, Yuexian Zou

Figure 1 for Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Figure 2 for Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Figure 3 for Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Figure 4 for Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Viaarxiv icon

Customizing General-Purpose Foundation Models for Medical Report Generation

Add code
Bookmark button
Alert button
Jun 09, 2023
Bang Yang, Asif Raza, Yuexian Zou, Tong Zhang

Figure 1 for Customizing General-Purpose Foundation Models for Medical Report Generation
Figure 2 for Customizing General-Purpose Foundation Models for Medical Report Generation
Figure 3 for Customizing General-Purpose Foundation Models for Medical Report Generation
Figure 4 for Customizing General-Purpose Foundation Models for Medical Report Generation
Viaarxiv icon