Yin Cui

Open-Vocabulary Image Segmentation

Dec 22, 2021
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin

Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text

Dec 14, 2021
Qing Li, Boqing Gong, Yin Cui, Dan Kondratyuk, Xianzhi Du, Ming-Hsuan Yang, Matthew Brown

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

Dec 09, 2021
Liangzhe Yuan, Rui Qian, Yin Cui, Boqing Gong, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Exploring Temporal Granularity in Self-Supervised Video Representation Learning

Dec 08, 2021
Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui

Revisiting 3D ResNets for Video Recognition

Sep 03, 2021
Xianzhi Du, Yeqing Li, Yin Cui, Rui Qian, Jing Li, Irwan Bello

Federated Multi-Target Domain Adaptation

Aug 17, 2021
Chun-Han Yao, Boqing Gong, Yin Cui, Hang Qi, Yukun Zhu, Ming-Hsuan Yang

Single Image Texture Translation for Data Augmentation

Jun 25, 2021
Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie

Bridging the Gap Between Object Detection and User Intent via Query-Modulation

Jun 18, 2021
Marco Fornoni, Chaochao Yan, Liangchen Luo, Kimberly Wilber, Alex Stark, Yin Cui, Boqing Gong, Andrew Howard

Zero-Shot Detection via Vision and Language Knowledge Distillation

Apr 28, 2021
Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

Apr 22, 2021
Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong
