Zi-Yi Dou

ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

Nov 02, 2023

DesCo: Learning Object Recognition with Rich Language Descriptions

Jun 24, 2023

Gender Biases in Automatic Evaluation Metrics: A Case Study on Image Captioning

May 24, 2023

Masked Path Modeling for Vision-and-Language Navigation

May 23, 2023

Generalized Decoding for Pixel, Image, and Language

Dec 21, 2022

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

Jun 15, 2022

FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation

Jun 09, 2022

Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization

Jan 01, 2022

An Empirical Study of Training End-to-End Vision-and-Language Transformers

Nov 25, 2021

RefSum: Refactoring Neural Summarization

Apr 15, 2021