
Rameswar Panda


Dynamic Network Quantization for Efficient Video Inference

Aug 23, 2021

An Image Classifier Can Suffice For Video Understanding

Jun 30, 2021

IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers

Jun 23, 2021

Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data

Jun 14, 2021

RegionViT: Regional-to-Local Attention for Vision Transformers

Jun 04, 2021

AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition

May 12, 2021

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

May 05, 2021

Detector-Free Weakly Supervised Grounding by Separation

Apr 20, 2021

A Broad Study on the Transferability of Visual Representations with Contrastive Learning

Apr 01, 2021

CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

Mar 27, 2021