Shih-Fu Chang

Columbia University

Supervised Masked Knowledge Distillation for Few-Shot Transformers

Mar 29, 2023

DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection

Mar 16, 2023

In Defense of Structural Symbolic Representation for Video Event-Relation Prediction

Jan 06, 2023

TempCLR: Temporal Alignment Representation with Contrastive Learning

Dec 28, 2022

Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding

Dec 14, 2022

Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense

Nov 10, 2022

Video Event Extraction via Tracking Visual States of Arguments

Nov 05, 2022

Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy

Oct 18, 2022

Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training

Jul 26, 2022

Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World

Jun 14, 2022