Abstract: Games and simulators can be a valuable platform for executing complex multi-agent, multiplayer, imperfect-information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces. These characteristics have attracted the artificial intelligence (AI) community by providing challenging benchmarks for algorithm development and the capability to rapidly iterate over new ideas. The success of AI algorithms in real-time strategy games such as StarCraft II has also attracted the attention of the military research community, which aims to explore similar techniques in analogous military scenarios. Aiming to bridge the connection between games and military applications, this work discusses past and current efforts on how games and simulators, together with AI algorithms, have been adapted to simulate certain aspects of military missions and how they might impact the future battlefield. This paper also investigates how advances in virtual reality and visual augmentation systems open new possibilities in human interfaces with gaming platforms and their military parallels.
Abstract: Reinforcement learning has been successful in many tasks, ranging from robotic control and games to energy management. In complex real-world environments with sparse rewards and long task horizons, however, sample efficiency remains a major challenge. Most complex tasks can be decomposed into high-level planning and low-level control, so it is important to enable agents to leverage this hierarchical structure and break larger tasks into multiple smaller sub-tasks. We introduce an approach in which language is used to specify sub-tasks: a high-level planner issues language commands to a low-level controller, and the low-level controller executes the sub-tasks based on those commands. Our experiments show that this method is able to solve complex long-horizon planning tasks with limited human supervision. Using language has the added benefits of interpretability and of allowing expert humans to take over the high-level planning task and provide language commands when necessary.
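The hierarchy described above can be pictured with a minimal sketch. All names here (high_level_planner, LowLevelController, the fixed task decomposition) are hypothetical placeholders rather than the interfaces used in the paper; the point is only the control flow of a planner issuing language commands that a controller executes.

```python
# Minimal sketch (hypothetical interfaces, not the paper's implementation):
# a high-level planner decomposes a task into language sub-commands and a
# low-level controller executes each one until it reports completion.

def high_level_planner(task):
    # Hypothetical fixed decomposition; in practice this would be learned
    # or provided with limited human supervision.
    plans = {
        "make_coffee": ["go to kitchen", "pick up mug", "pour coffee"],
    }
    return plans.get(task, [])

class LowLevelController:
    def execute(self, command, max_steps=100):
        # Placeholder control loop: a real controller would condition a
        # learned policy on the language command and environment state.
        for _ in range(max_steps):
            if self.step(command):
                return True
        return False

    def step(self, command):
        print(f"executing: {command}")
        return True  # pretend the sub-task finishes in one step

controller = LowLevelController()
for command in high_level_planner("make_coffee"):
    if not controller.execute(command):
        break  # an expert human could take over and issue a new command here
```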
Abstract: While both navigation and manipulation are challenging topics in isolation, many tasks require the ability to navigate and manipulate in concert. To this end, we propose a mobile manipulation system that leverages novel navigation and shape-completion methods to manipulate an object with a mobile robot. Our system uses the uncertainty in the initial estimate of a manipulation target to calculate a predicted next-best-view. Without the need for localization, the robot then uses the predicted panoramic view at the next-best-view location to navigate to the desired location, capture a second view of the object, create a new model that predicts the shape of the object more accurately than a single image alone, and use this model for grasp planning. We show that the system is highly effective for mobile manipulation tasks through simulation experiments using real-world data, as well as through ablations on each component of our system.
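One common way to realize a next-best-view computation like the one described above is to score candidate camera poses by how much shape-estimate uncertainty they would observe. The sketch below is an illustrative toy version of that idea (entropy over a voxel occupancy estimate), not the paper's actual method; the voxel grid, candidate views, and visibility function are all stand-ins.

```python
# Illustrative sketch (not the paper's algorithm): choose a next-best-view by
# scoring candidate camera poses by how much occupancy uncertainty (entropy)
# they are expected to observe in a voxelized shape estimate.
import numpy as np

def voxel_entropy(p_occupied):
    p = np.clip(p_occupied, 1e-6, 1 - 1e-6)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def next_best_view(voxel_probs, candidate_views, visible_mask_fn):
    # voxel_probs: (N,) occupancy probabilities of the current shape estimate
    # candidate_views: list of camera poses; visible_mask_fn(view) -> bool mask
    entropy = voxel_entropy(voxel_probs)
    scores = [entropy[visible_mask_fn(view)].sum() for view in candidate_views]
    return candidate_views[int(np.argmax(scores))]

# Toy usage: 100 voxels, 3 candidate views each seeing a random subset.
rng = np.random.default_rng(0)
probs = rng.uniform(size=100)
views = ["left", "front", "right"]
masks = {v: rng.uniform(size=100) > 0.5 for v in views}
print(next_best_view(probs, views, lambda v: masks[v]))
```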
Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had previously been unlearnable, these techniques are still susceptible to the longstanding problem of reward sparsity. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game in which reward is given only at the end of a game, which is usually very long. While this problem can be addressed through reward shaping, such approaches typically require a human expert with specialized knowledge. Inspired by the vision of enabling reward shaping through the more-accessible paradigm of natural-language narration, we develop a technique that can provide the benefits of reward shaping using natural-language commands. Our narration-guided RL agent projects sequences of natural-language commands into the same high-dimensional representation space as the corresponding goal states. We show that our method improves performance compared to traditional reward-shaping approaches. Additionally, we demonstrate the ability of our method to generalize to unseen natural-language commands.
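Once commands and goal states share an embedding space, one simple way to turn narration into a dense reward is to add a similarity bonus to the sparse environment reward. The sketch below shows that shaping step only, with random vectors standing in for learned encoders; the weight and the encoders themselves are assumptions, not the paper's implementation.

```python
# Sketch of narration-based reward shaping (hypothetical encoders, not the
# paper's model): the shaped reward adds a bonus proportional to the cosine
# similarity between the current state embedding and the embedding of the
# active natural-language command.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def shaped_reward(env_reward, state_emb, command_emb, weight=0.1):
    # env_reward: sparse reward from the game; the bonus densifies it.
    return env_reward + weight * cosine(state_emb, command_emb)

# Toy usage with random embeddings standing in for learned encoders.
rng = np.random.default_rng(1)
state_emb, command_emb = rng.normal(size=64), rng.normal(size=64)
print(shaped_reward(0.0, state_emb, command_emb))
```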
Abstract: In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration. In this way, we leverage the advantages of both representations: we learn the policy using standard state representations, but are able to specify the expected behavior using a video demonstration. We train an autonomous agent using a single video demonstration and use human feedback, in the form of numerical similarity ratings, to map the standard representation to the visual representation with a neural network. We show the effectiveness of our method by teaching a hopper agent in MuJoCo to perform a backflip using a single video demonstration generated in MuJoCo as well as from a real-world YouTube video of a person performing a backflip. Additionally, we show that our method can transfer to new tasks, such as hopping, with very little human feedback.
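A minimal sketch of the mapping step described above: a small network is regressed onto human similarity ratings so that, given the agent's standard state vector, it predicts how closely the agent matches the demonstration. The network size, the hopper state dimension, and the random stand-in data are assumptions for illustration, not the paper's setup.

```python
# Illustrative sketch (assumed setup, not the paper's code): a small network
# is trained on human-provided similarity ratings to map the agent's standard
# state vector to a score indicating how well it matches the video demonstration.
import torch
import torch.nn as nn

class SimilarityModel(nn.Module):
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, state):
        return self.net(state).squeeze(-1)

model = SimilarityModel(state_dim=11)       # e.g. MuJoCo hopper observation size
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy data standing in for (agent state, human rating) pairs.
states = torch.randn(256, 11)
ratings = torch.rand(256)                   # numerical ratings in [0, 1]

for _ in range(100):                        # supervised regression on ratings
    loss = nn.functional.mse_loss(model(states), ratings)
    opt.zero_grad(); loss.backward(); opt.step()

# The trained model's output can then serve as a reward signal for the policy.
```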
Abstract: We present a robot navigation system that uses an imitation learning framework to successfully navigate in complex environments. Our framework takes a pre-built 3D scan of a real environment and trains an agent from pre-generated expert trajectories to navigate to any position given a panoramic view of the goal and the current visual input, without relying on a map, compass, odometry, GPS, or the relative position of the target at runtime. Our end-to-end trained agent uses RGB and depth (RGBD) information and can handle large environments (up to $1031m^2$) across multiple rooms (up to $40$) and generalizes to unseen targets. We show that when compared to several baselines using deep reinforcement learning and RGBD SLAM, our method (1) requires fewer training examples and less training time, (2) reaches the goal location with higher accuracy, (3) produces better solutions with shorter paths for long-range navigation tasks, and (4) generalizes to unseen environments given an RGBD map of the environment.
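At its core, training from pre-generated expert trajectories amounts to behavioral cloning on (current RGBD observation, goal panorama, expert action) triples. The sketch below shows one such training step with toy tensors and an arbitrary small network; the architecture, image size, and action space are illustrative assumptions, not the paper's.

```python
# Minimal behavioral-cloning sketch (assumed shapes, not the paper's network):
# the policy takes the current RGBD frame and a panoramic view of the goal and
# is trained to imitate actions from pre-generated expert trajectories.
import torch
import torch.nn as nn

class NavPolicy(nn.Module):
    def __init__(self, num_actions=4):
        super().__init__()
        self.obs_enc = nn.Sequential(               # current RGBD frame (4 channels)
            nn.Conv2d(4, 16, 5, stride=2), nn.ReLU(), nn.Flatten())
        self.goal_enc = nn.Sequential(              # goal panorama (RGBD, 4 channels)
            nn.Conv2d(4, 16, 5, stride=2), nn.ReLU(), nn.Flatten())
        self.head = nn.Linear(2 * 16 * 30 * 30, num_actions)  # for 64x64 inputs

    def forward(self, obs, goal):
        return self.head(torch.cat([self.obs_enc(obs), self.goal_enc(goal)], dim=1))

policy = NavPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Toy batch standing in for expert (observation, goal, action) triples.
obs = torch.randn(8, 4, 64, 64)
goal = torch.randn(8, 4, 64, 64)
actions = torch.randint(0, 4, (8,))

loss = nn.functional.cross_entropy(policy(obs, goal), actions)
opt.zero_grad(); loss.backward(); opt.step()
```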
Abstract: In this paper we present a technique for learning how to solve a multi-robot mission that requires interaction with an external environment by repeatedly observing an expert system executing the same mission. We define the expert system as a team of robots equipped with a library of controllers, each designed to solve a specific task, supervised by an expert policy that appropriately selects controllers based on the states of the robots and the environment. The objective is for an untrained team of robots, equipped with the same library of controllers but agnostic to the expert policy, to execute the mission with performance comparable to that of the expert system. From observations of the expert system, the Interactive Multiple Model technique is used to estimate the individual controllers executed by the expert policy. Then, the history of estimated controllers and environmental states is used to learn a policy for the untrained robots. In a perimeter-protection scenario with a team of simulated differential-drive robots, we show that the learned policy endows the untrained team with performance comparable to that of the expert system.
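The controller-estimation step can be pictured as maintaining a belief over which library controller the observed expert robot is currently executing. The sketch below is a heavily simplified, single-step mode-probability update in the spirit of Interactive Multiple Model estimation; the transition matrix and likelihood values are invented for illustration, and the full IMM machinery (per-mode filters, mixing of state estimates) is omitted.

```python
# Simplified sketch of the idea behind Interactive Multiple Model estimation
# (not the full IMM filter): maintain a probability over which controller in
# the library the observed robot is executing, and update it from how well
# each controller's predicted motion explains the observed motion.
import numpy as np

def update_mode_probs(mode_probs, transition, likelihoods):
    # mode_probs: prior P(controller_i); transition: mode switching matrix;
    # likelihoods: p(observation | controller_i) from per-controller predictions.
    predicted = transition.T @ mode_probs          # mixing / prediction step
    posterior = predicted * likelihoods            # Bayes update
    return posterior / posterior.sum()

num_controllers = 3
probs = np.full(num_controllers, 1.0 / num_controllers)
transition = np.array([[0.9, 0.05, 0.05],
                       [0.05, 0.9, 0.05],
                       [0.05, 0.05, 0.9]])

# Toy likelihoods: controller 1 explains the observed motion best at this step.
likelihoods = np.array([0.2, 0.7, 0.1])
probs = update_mode_probs(probs, transition, likelihoods)
print(probs)   # estimated controller = argmax; its history feeds policy learning
```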
Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had previously been unlearnable, these techniques are still susceptible to the longstanding problem of {\em reward sparsity}. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game in which reward is given only at the end of a game, which is usually very long. While this problem can be addressed through reward shaping, such approaches typically require a human expert with specialized knowledge. Inspired by the vision of enabling reward shaping through the more-accessible paradigm of natural-language narration, we investigate to what extent we can contextualize these narrations by grounding them in goal-specific states. We present a mutual-embedding model using a multi-input deep neural network that projects a sequence of natural-language commands into the same high-dimensional representation space as the corresponding goal states. We show that using this model we can learn an embedding space, with separable and distinct clusters, that accurately maps natural-language commands to corresponding game states. We also discuss how this model can allow narrations to be used as a robust form of reward shaping to improve RL performance and efficiency.
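A mutual embedding of this kind is commonly trained with two branches and a contrastive objective that aligns matching command/state pairs. The sketch below illustrates that pattern with toy feature dimensions and an InfoNCE-style loss; the encoders, dimensions, and temperature are assumptions, not the multi-input architecture described in the paper.

```python
# Sketch of a mutual-embedding idea (toy dimensions, not the paper's
# architecture): one branch embeds a natural-language command, another embeds
# a game state, and a contrastive-style loss pulls matching pairs together
# while pushing mismatched pairs apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

text_encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU(), nn.Linear(128, 64))
state_encoder = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 64))
opt = torch.optim.Adam(list(text_encoder.parameters()) +
                       list(state_encoder.parameters()), lr=1e-3)

# Toy batch: row i of `commands` corresponds to row i of `states`.
commands = torch.randn(32, 300)   # e.g. pooled word vectors for a command
states = torch.randn(32, 512)     # e.g. features of the corresponding goal state

z_text = F.normalize(text_encoder(commands), dim=1)
z_state = F.normalize(state_encoder(states), dim=1)
logits = z_text @ z_state.t() / 0.1            # pairwise similarities
labels = torch.arange(32)                      # matching pairs lie on the diagonal
loss = F.cross_entropy(logits, labels)         # InfoNCE-style objective
opt.zero_grad(); loss.backward(); opt.step()
```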
Abstract: Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces. However, most of these successes have been primarily in simulated environments where failure is of little or no consequence. Most real-world applications, however, require solutions that are safe to operate during training, as catastrophic failures are inadmissible, especially when human interaction is involved. Current safe-RL systems rely on human oversight during training and exploration to ensure that the agent does not enter a catastrophic state. These methods require a large amount of human labor and are very difficult to scale. We present a hybrid method for reducing human intervention time by combining model-based approaches with a supervised learner, improving sample efficiency while also ensuring safety. We evaluate these methods on various grid-world environments using both standard and visual representations, and show that our approach achieves better performance in terms of sample efficiency, number of catastrophic states reached, and overall task performance compared to traditional model-free approaches.
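One way to cut down on human oversight, in the spirit of the hybrid approach described above, is to fit a supervised "blocker" on the states and actions where a human previously intervened and use it to veto actions predicted to be catastrophic. The sketch below is a toy illustration of that idea with synthetic data and a logistic-regression blocker; the features, labels, and fallback action are assumptions, not the paper's method.

```python
# Illustrative sketch (assumed interfaces): a supervised "blocker" is trained
# on states/actions where a human intervened, and is then used to veto actions
# predicted to lead to catastrophic states, reducing the need for oversight.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy dataset: feature vector = (state, action), label = 1 if the human blocked it.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + X[:, 5] > 1.0).astype(int)       # stand-in for human intervention labels

blocker = LogisticRegression().fit(X, y)

def safe_action(state, candidate_actions, threshold=0.5):
    # Return the first candidate the blocker considers safe; fall back to a
    # designated safe action (here: the last candidate) if none pass.
    for a in candidate_actions:
        features = np.concatenate([state, a])[None, :]
        if blocker.predict_proba(features)[0, 1] < threshold:
            return a
    return candidate_actions[-1]

state = rng.normal(size=5)
actions = [rng.normal(size=1) for _ in range(4)]
print(safe_action(state, actions))
```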
Abstract: While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require large amounts of training data. One way to increase the speed at which agents learn to perform tasks is to leverage the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations. Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed deep learning. In this paper, we do both: we propose Deep TAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks in order to learn complex tasks in just a short amount of time with a human trainer. We demonstrate Deep TAMER's success by using it, and just 15 minutes of human-provided feedback, to train an agent that performs better than humans on the Atari game of Bowling, a task that has proven difficult even for state-of-the-art reinforcement learning methods.
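The TAMER idea underlying the abstract can be sketched as regressing a model of the human's scalar feedback and acting greedily with respect to it. The code below shows only that core loop with toy dimensions; Deep TAMER additionally uses a deep state encoder and a credit-assignment scheme for delayed feedback, neither of which is reproduced here.

```python
# Core TAMER idea in sketch form (not the full Deep TAMER algorithm): regress
# a model H(s, a) of the human's scalar feedback and act greedily with respect
# to the predicted feedback rather than an environment reward.
import torch
import torch.nn as nn

class HumanRewardModel(nn.Module):
    def __init__(self, state_dim, num_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, num_actions))

    def forward(self, state):
        return self.net(state)                     # predicted feedback per action

model = HumanRewardModel(state_dim=8, num_actions=4)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def update(state, action, feedback):
    # Supervised update toward the human's scalar feedback for the taken action.
    pred = model(state)[action]
    loss = (pred - feedback) ** 2
    opt.zero_grad(); loss.backward(); opt.step()

def act(state):
    with torch.no_grad():
        return int(model(state).argmax())          # greedy w.r.t. predicted feedback

state = torch.randn(8)
update(state, action=act(state), feedback=1.0)     # toy feedback signal
```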