Zhicheng Huang

PixelLM: Pixel Reasoning with Large Multimodal Model

Dec 04, 2023
Zhongwei Ren, Zhicheng Huang, Yunchao Wei, Yao Zhao, Dongmei Fu, Jiashi Feng, Xiaojie Jin

VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending

May 22, 2023
Xingjian He, Sihan Chen, Fan Ma, Zhicheng Huang, Xiaojie Jin, Zikang Liu, Dongmei Fu, Yi Yang, Jing Liu, Jiashi Feng

CMAE-V: Contrastive Masked Autoencoders for Video Action Recognition

Jan 15, 2023
Cheng-Ze Lu, Xiaojie Jin, Zhicheng Huang, Qibin Hou, Ming-Ming Cheng, Jiashi Feng

Contrastive Masked Autoencoders are Stronger Vision Learners

Jul 27, 2022
Zhicheng Huang, Xiaojie Jin, Chengze Lu, Qibin Hou, Ming-Ming Cheng, Dongmei Fu, Xiaohui Shen, Jiashi Feng

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Apr 08, 2021
Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu

Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers

Apr 02, 2020
Zhicheng Huang, Zhaoyang Zeng, Bei Liu, Dongmei Fu, Jianlong Fu

Learning Rich Image Region Representation for Visual Question Answering

Oct 29, 2019
Bei Liu, Zhicheng Huang, Zhaoyang Zeng, Zheyu Chen, Jianlong Fu
