Title: Attention Based Natural Language Grounding by Navigating Virtual Environment

Apr 23, 2018

Abhishek Sinha, Akilesh B, Mausoom Sarkar, Balaji Krishnamurthy

Abstract: In this work, we focus on the problem of grounding language by training an agent to follow a set of natural language instructions and navigate to a target object in an environment. The agent receives visual information as raw pixels, together with a natural language instruction specifying the task to be achieved. Apart from these two sources of information, our model has no prior knowledge of either the visual or the textual modality and is end-to-end trainable. We develop an attention mechanism for multi-modal fusion of visual and textual modalities that allows the agent to learn to complete the task and also achieve language grounding. Our experimental results show that our attention mechanism outperforms the existing multi-modal fusion mechanisms proposed for this task in both 2D and 3D environments. We show that the learnt textual representations are semantically meaningful, as they follow vector arithmetic, and are consistent enough to induce translation between instructions in different natural languages. We also show that our model generalizes effectively to unseen scenarios and exhibits zero-shot generalization capabilities in both 2D and 3D environments. The code for our 2D environment, as well as the models that we developed for both 2D and 3D, is available at https://github.com/rl-lang-grounding/rl-lang-ground
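
The abstract does not spell out the fusion architecture, so the following is only a generic sketch of what attention-based fusion of a visual feature grid with an instruction embedding can look like. It is not the authors' model. The function names (`softmax`, `attention_fuse`), the dot-product scoring, and the toy feature values are all assumptions made for illustration; the actual implementation is in the repository linked above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(visual_cells, instruction_vec):
    """Fuse per-location visual features with one instruction embedding.

    visual_cells: list of feature vectors, one per spatial location.
    instruction_vec: instruction embedding with the same dimension.
    Returns (attention_weights, fused_vector).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Score each location by its dot product with the instruction embedding.
    scores = [
        sum(v * t for v, t in zip(cell, instruction_vec))
        for cell in visual_cells
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Attention-weighted sum of the visual features.
    dim = len(instruction_vec)
    fused = [
        sum(w * cell[d] for w, cell in zip(weights, visual_cells))
        for d in range(dim)
    ]
    return weights, fused


# Toy input: three spatial locations with 2-d features (made-up values).
cells = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Made-up instruction embedding that aligns best with the second location.
instruction = [0.0, 4.0]
weights, fused = attention_fuse(cells, instruction)
```

In this toy setup the second location receives the largest attention weight because its features align most closely with the instruction embedding. In an end-to-end trainable agent, a fused vector of this kind would typically be passed on to the policy network.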
