Licheng Yu

RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data

Feb 28, 2023
Sangwoo Mo, Jong-Chyi Su, Chih-Yao Ma, Mido Assran, Ishan Misra, Licheng Yu, Sean Bell

Que2Engage: Embedding-based Retrieval for Relevant and Engaging Products at Facebook Marketplace

Feb 21, 2023
Yunzhong He, Yuxin Tian, Mengjiao Wang, Feier Chen, Licheng Yu, Maolong Tang, Congcong Chen, Ning Zhang, Bin Kuang, Arul Prakash

CiT: Curation in Training for Effective Vision-Language Data

Jan 05, 2023
Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer

Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

Nov 23, 2022
Tsu-Jui Fu, Licheng Yu, Ning Zhang, Cheng-Yang Fu, Jong-Chyi Su, William Yang Wang, Sean Bell

FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning

Oct 26, 2022
Suvir Mirchandani, Licheng Yu, Mengjiao Wang, Animesh Sinha, Wenwen Jiang, Tao Xiang, Ning Zhang

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

Jul 17, 2022
Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang

GEB+: A benchmark for generic event boundary captioning, grounding and text-based retrieval

Apr 10, 2022
Yuxuan Wang, Difei Gao, Licheng Yu, Stan Weixian Lei, Matt Feiszli, Mike Zheng Shou

LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval

Mar 10, 2022
Jie Lei, Xinlei Chen, Ning Zhang, Mengjiao Wang, Mohit Bansal, Tamara L. Berg, Licheng Yu

Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment

Mar 01, 2022
Mingyang Zhou, Licheng Yu, Amanpreet Singh, Mengjiao Wang, Zhou Yu, Ning Zhang

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval

Feb 15, 2022
Licheng Yu, Jun Chen, Animesh Sinha, Mengjiao MJ Wang, Hugo Chen, Tamara L. Berg, Ning Zhang
