Xiyang Dai

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

Jun 03, 2022
Yujia Xie, Luowei Zhou, Xiyang Dai, Lu Yuan, Nguyen Bach, Ce Liu, Michael Zeng

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

May 15, 2022
Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks

Apr 28, 2022
Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Xiyang Dai, Bin Xiao, Jianwei Yang, Haoxuan You, Kai-Wei Chang, Shih-Fu Chang, Lu Yuan

Residual Mixture of Experts

Apr 20, 2022
Lemeng Wu, Mengchen Liu, Yinpeng Chen, Dongdong Chen, Xiyang Dai, Lu Yuan

CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks

Jan 15, 2022
Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Jianwei Yang, Xiyang Dai, Bin Xiao, Haoxuan You, Shih-Fu Chang, Lu Yuan

RegionCLIP: Region-based Language-Image Pretraining

Dec 16, 2021
Yiwu Zhong, Jianwei Yang, Pengchuan Zhang, Chunyuan Li, Noel Codella, Liunian Harold Li, Luowei Zhou, Xiyang Dai, Lu Yuan, Yin Li, Jianfeng Gao

BEVT: BERT Pretraining of Video Transformers

Dec 02, 2021
Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan

Florence: A New Foundation Model for Computer Vision

Nov 22, 2021
Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

UFO: A UniFied TransfOrmer for Vision-Language Representation Learning

Nov 19, 2021
Jianfeng Wang, Xiaowei Hu, Zhe Gan, Zhengyuan Yang, Xiyang Dai, Zicheng Liu, Yumao Lu, Lijuan Wang

Mobile-Former: Bridging MobileNet and Transformer

Aug 12, 2021
Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Xiaoyi Dong, Lu Yuan, Zicheng Liu
