Handong Li

COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

Jun 15, 2023
Sihan Chen, Xingjian He, Handong Li, Xiaojie Jin, Jiashi Feng, Jing Liu

Via arXiv

VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

May 29, 2023
Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu

Via arXiv

Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner

May 19, 2023
Zikang Liu, Sihan Chen, Longteng Guo, Handong Li, Xingjian He, Jing Liu

Via arXiv