Yehao Li

Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network

Jan 27, 2021
Via arXiv

Pre-training for Video Captioning Challenge 2020 Summary

Jul 27, 2020
Via arXiv

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training

Jul 05, 2020
Via arXiv

Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation

Jun 11, 2020
Via arXiv

X-Linear Attention Networks for Image Captioning

Mar 31, 2020
Via arXiv

Multi-Source Domain Adaptation and Semi-Supervised Domain Adaptation with Focus on Visual Domain Adaptation Challenge 2019

Oct 14, 2019
Via arXiv

Hierarchy Parsing for Image Captioning

Sep 10, 2019
Via arXiv

Deep Metric Learning with Density Adaptivity

Sep 09, 2019
Via arXiv

Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019

Jun 14, 2019
Via arXiv

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

May 03, 2019
Via arXiv