GLIPv2: Unifying Localization and Vision-Language Understanding
Haotian Zhang, Pengchuan Zhang, Xiaowei Hu, Yen-Chun Chen, Liunian Harold Li, Xiyang Dai, Lijuan Wang, Lu Yuan, Jenq-Neng Hwang, Jianfeng Gao

* Code will be released at https://github.com/microsoft/GLIP 

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
Lingchen Meng, Xiyang Dai, Yinpeng Chen, Pengchuan Zhang, Dongdong Chen, Mengchen Liu, Jianfeng Wang, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang


Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
Yujia Xie, Luowei Zhou, Xiyang Dai, Lu Yuan, Nguyen Bach, Ce Liu, Michael Zeng


Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

* CVPR 2022, code is available at https://github.com/liuqk3/PUT 

Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks
Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Xiyang Dai, Bin Xiao, Jianwei Yang, Haoxuan You, Kai-Wei Chang, Shih-Fu Chang, Lu Yuan

* arXiv admin note: substantial text overlap with arXiv:2201.05729 

Residual Mixture of Experts
Lemeng Wu, Mengchen Liu, Yinpeng Chen, Dongdong Chen, Xiyang Dai, Lu Yuan


CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks
Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Jianwei Yang, Xiyang Dai, Bin Xiao, Haoxuan You, Shih-Fu Chang, Lu Yuan


RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong, Jianwei Yang, Pengchuan Zhang, Chunyuan Li, Noel Codella, Liunian Harold Li, Luowei Zhou, Xiyang Dai, Lu Yuan, Yin Li, Jianfeng Gao

* Technical report 

BEVT: BERT Pretraining of Video Transformers
Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan


Florence: A New Foundation Model for Computer Vision
Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

