Haoqi Fan

MAViL: Masked Audio-Video Learners

Dec 15, 2022

Scaling Language-Image Pre-training via Masking

Dec 01, 2022

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference

Nov 18, 2022

Masked Autoencoders As Spatiotemporal Learners

May 18, 2022

On the Importance of Asymmetry for Siamese Representation Learning

Apr 01, 2022

Unified Transformer Tracker for Object Tracking

Mar 29, 2022

MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition

Jan 20, 2022

Masked Feature Prediction for Self-Supervised Visual Pre-Training

Dec 16, 2021

Improved Multiscale Vision Transformers for Classification and Detection

Dec 02, 2021

PyTorchVideo: A Deep Learning Library for Video Understanding

Nov 18, 2021