Luowei Zhou

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training

Apr 01, 2021
Via arXiv

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

Feb 11, 2021
Via arXiv

Temporally Guided Articulated Hand Pose Tracking in Surgical Videos

Jan 12, 2021
Via arXiv

Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding

Sep 13, 2020
Via arXiv

Unified Vision-Language Pre-Training for Image Captioning and VQA

Oct 03, 2019
Via arXiv

Grounded Video Description

Dec 17, 2018
Via arXiv

Dynamic Graph Modules for Modeling Higher-Order Interactions in Activity Recognition

Dec 13, 2018
Via arXiv

Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction

Jul 20, 2018
Via arXiv

End-to-End Dense Video Captioning with Masked Transformer

Apr 03, 2018
Via arXiv

Towards Automatic Learning of Procedures from Web Instructional Videos

Nov 21, 2017
Via arXiv