Kanzhi Cheng

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

Jan 17, 2024
Kanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Yantao Li, Jianbing Zhang, Zhiyong Wu

Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models

Aug 06, 2023
Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, Shujian Huang, Jiajun Chen

ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora

Aug 02, 2023
Kanzhi Cheng, Zheng Ma, Shi Zong, Jianbing Zhang, Xinyu Dai, Jiajun Chen

Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model

Aug 02, 2023
Kanzhi Cheng, Wenpo Song, Zheng Ma, Wenhao Zhu, Zixuan Zhu, Jianbing Zhang
