Zhiyuan Fang

SEED: Self-supervised Distillation For Visual Representation

Jan 12, 2021

Weak Supervision and Referring Attention for Temporal-Textual Association Learning

Jun 27, 2020

HRDNet: High-resolution Detection Network for Small Objects

Jun 13, 2020

ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language

May 15, 2020

Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning

Mar 17, 2020

Blocksworld Revisited: Learning and Reasoning to Generate Event-Sequences from Image Pairs

May 28, 2019

Modularized Textual Grounding for Counterfactual Resilience

Apr 07, 2019

Weakly Supervised Attention Learning for Textual Phrases Grounding

May 01, 2018

Range Loss for Deep Face Recognition with Long-tail

Nov 28, 2016