Zicheng Liu

Decoupling Object Detection from Human-Object Interaction Recognition

Dec 13, 2021
Ying Jin, Yinpeng Chen, Lijuan Wang, Jianfeng Wang, Pei Yu, Lin Liang, Jenq-Neng Hwang, Zicheng Liu

Improving Vision Transformers for Incremental Learning

Dec 12, 2021
Pei Yu, Yinpeng Chen, Ying Jin, Zicheng Liu

Injecting Semantic Concepts into End-to-End Image Captioning

Dec 09, 2021
Zhiyuan Fang, Jianfeng Wang, Xiaowei Hu, Lin Liang, Zhe Gan, Lijuan Wang, Yezhou Yang, Zicheng Liu

MLP Architectures for Vision-and-Language Modeling: An Empirical Study

Dec 08, 2021
Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang

Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup

Nov 30, 2021
Siyuan Li, Zicheng Liu, Di Wu, Zihan Liu, Stan Z. Li

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

Nov 30, 2021
Xiaotian Han, Quanzeng You, Chunyu Wang, Zhizheng Zhang, Peng Chu, Houdong Hu, Jiang Wang, Zicheng Liu

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning

Nov 25, 2021
Kevin Lin, Linjie Li, Chung-Ching Lin, Faisal Ahmed, Zhe Gan, Zicheng Liu, Yumao Lu, Lijuan Wang

An Empirical Study of Training End-to-End Vision-and-Language Transformers

Nov 25, 2021
Zi-Yi Dou, Yichong Xu, Zhe Gan, Jianfeng Wang, Shuohang Wang, Lijuan Wang, Chenguang Zhu, Pengchuan Zhang, Lu Yuan, Nanyun Peng, Zicheng Liu, Michael Zeng

VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling

Nov 24, 2021
Tsu-Jui Fu, Linjie Li, Zhe Gan, Kevin Lin, William Yang Wang, Lijuan Wang, Zicheng Liu

Scaling Up Vision-Language Pre-training for Image Captioning

Nov 24, 2021
Xiaowei Hu, Zhe Gan, Jianfeng Wang, Zhengyuan Yang, Zicheng Liu, Yumao Lu, Lijuan Wang
