Xitong Yang

Vision Transformers Are Good Mask Auto-Labelers

Jan 10, 2023
Via arXiv

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization

Mar 29, 2022
Via arXiv

Efficient Video Transformers with Spatial-Temporal Token Selection

Nov 23, 2021
Via arXiv

Semi-Supervised Vision Transformers

Nov 22, 2021
Via arXiv

Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories

Apr 02, 2021
Via arXiv

GTA: Global Temporal Attention for Video Action Understanding

Dec 15, 2020
Via arXiv

Hierarchical Contrastive Motion Learning for Video Action Recognition

Jul 20, 2020
Via arXiv

A Generic Visualization Approach for Convolutional Neural Networks

Jul 19, 2020
Via arXiv

Cross-X Learning for Fine-Grained Visual Categorization

Sep 10, 2019
Via arXiv

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

Apr 19, 2019
Via arXiv