Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:CNN+CNN: Convolutional Decoders for Image Captioning

May 23, 2018

Qingzhong Wang, Antoni B. Chan

Figure 1 for CNN+CNN: Convolutional Decoders for Image Captioning

Figure 2 for CNN+CNN: Convolutional Decoders for Image Captioning

Figure 3 for CNN+CNN: Convolutional Decoders for Image Captioning

Figure 4 for CNN+CNN: Convolutional Decoders for Image Captioning

Share this with someone who'll enjoy it:

Abstract:Image captioning is a challenging task that combines the field of computer vision and natural language processing. A variety of approaches have been proposed to achieve the goal of automatically describing an image, and recurrent neural network (RNN) or long-short term memory (LSTM) based models dominate this field. However, RNNs or LSTMs cannot be calculated in parallel and ignore the underlying hierarchical structure of a sentence. In this paper, we propose a framework that only employs convolutional neural networks (CNNs) to generate captions. Owing to parallel computing, our basic model is around 3 times faster than NIC (an LSTM-based model) during training time, while also providing better results. We conduct extensive experiments on MSCOCO and investigate the influence of the model width and depth. Compared with LSTM-based models that apply similar attention mechanisms, our proposed models achieves comparable scores of BLEU-1,2,3,4 and METEOR, and higher scores of CIDEr. We also test our model on the paragraph annotation dataset, and get higher CIDEr score compared with hierarchical LSTMs

View paper on

Share this with someone who'll enjoy it:

Title:CNN+CNN: Convolutional Decoders for Image Captioning

Paper and Code