Tao Mei

Stand-Alone Inter-Frame Attention in Video Models

Jun 14, 2022
Via arXiv

Comprehending and Ordering Semantics for Image Captioning

Jun 14, 2022
Via arXiv

MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing

Jun 13, 2022
Via arXiv

Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection

Jun 13, 2022
Via arXiv

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation

Jun 13, 2022
Via arXiv

Structured Two-stream Attention Network for Video Question Answering

Jun 02, 2022
Via arXiv

Gait Recognition in the Wild with Dense 3D Representations and A Benchmark

Apr 06, 2022
Via arXiv

A-ACT: Action Anticipation through Cycle Transformations

Apr 02, 2022
Via arXiv

Visualizing and Understanding Patch Interactions in Vision Transformer

Mar 11, 2022
Via arXiv

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

Mar 09, 2022
Via arXiv