Title: On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

Feb 26, 2020

Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton van den Hengel, Liangwei Wang

Abstract: Visual Question Answering (VQA) methods have made incredible progress, but suffer from a failure to generalize. This is visible in the fact that they are vulnerable to learning coincidental correlations in the data rather than deeper relations between image content and ideas expressed in language. We present a dataset that takes a step towards addressing this problem in that it contains questions expressed in two languages, and an evaluation process that co-opts a well understood image-based metric to reflect the method's ability to reason. Measuring reasoning directly encourages generalization by penalizing answers that are coincidentally correct. The dataset reflects the scene-text version of the VQA problem, and the reasoning evaluation can be seen as a text-based version of a referring expression challenge. Experiments and analysis are provided that show the value of the dataset.

* Accepted to Proc. IEEE Conf. Computer Vision and Pattern Recognition 2020
