Bohan Zhai

InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding

Mar 03, 2024
Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

Via arXiv

Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning

Jan 18, 2024
Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang

Via arXiv

COCO is "ALL" You Need for Visual Instruction Fine-tuning

Jan 17, 2024
Xiaotian Han, Yiqi Wang, Bohan Zhai, Quanzeng You, Hongxia Yang

Via arXiv

InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Dec 04, 2023
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

Via arXiv

CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Nov 27, 2023
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

Via arXiv

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption

Oct 03, 2023
Bohan Zhai, Shijia Yang, Xiangchen Zhao, Chenfeng Xu, Sheng Shen, Dongdi Zhao, Kurt Keutzer, Manling Li, Tan Yan, Xiangjun Fan

Via arXiv

Multitask Vision-Language Prompt Tuning

Dec 05, 2022
Sheng Shen, Shijia Yang, Tianjun Zhang, Bohan Zhai, Joseph E. Gonzalez, Kurt Keutzer, Trevor Darrell

Via arXiv