Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone


Jun 15, 2022
Zi-Yi Dou, Aishwarya Kamath, Zhe Gan, Pengchuan Zhang, Jianfeng Wang, Linjie Li, Zicheng Liu, Ce Liu, Yann LeCun, Nanyun Peng, Jianfeng Gao, Lijuan Wang

* Project Website: https://ashkamath.github.io/FIBER_page 

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling


Jun 14, 2022
Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Ce Liu, Lijuan Wang


GIT: A Generative Image-to-text Transformer for Vision and Language


May 31, 2022
Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang


Cross-modal Representation Learning for Zero-shot Action Recognition


May 03, 2022
Chung-Ching Lin, Kevin Lin, Linjie Li, Lijuan Wang, Zicheng Liu

* CVPR 2022 

MLP Architectures for Vision-and-Language Modeling: An Empirical Study


Dec 08, 2021
Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang

* 15 pages 

SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning


Nov 25, 2021
Kevin Lin, Linjie Li, Chung-Ching Lin, Faisal Ahmed, Zhe Gan, Zicheng Liu, Yumao Lu, Lijuan Wang


VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling


Nov 24, 2021
Tsu-Jui Fu, Linjie Li, Zhe Gan, Kevin Lin, William Yang Wang, Lijuan Wang, Zicheng Liu

* Code is available at https://github.com/tsujuifu/pytorch_violet 

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation


Jun 08, 2021
Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara Lee Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu

* VALUE is available at https://value-leaderboard.github.io/ 

Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models


Jun 01, 2021
Linjie Li, Jie Lei, Zhe Gan, Jingjing Liu


Playing Lottery Tickets with Vision and Language


Apr 23, 2021
Zhe Gan, Yen-Chun Chen, Linjie Li, Tianlong Chen, Yu Cheng, Shuohang Wang, Jingjing Liu

