
Nick Rhinehart


The Waymo Open Sim Agents Challenge

May 19, 2023
Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nick Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov

Figures 1–4 for The Waymo Open Sim Agents Challenge

In this work, we define the Waymo Open Sim Agents Challenge (WOSAC). Simulation with realistic, interactive agents represents a key task for autonomous vehicle software development. WOSAC is the first public challenge to tackle this task and propose corresponding metrics. The goal of the challenge is to stimulate the design of realistic simulators that can be used to evaluate and train a behavior model for autonomous driving. We outline our evaluation methodology and present preliminary results for a number of different baseline simulation agent methods.
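The challenge scores how closely simulated agent rollouts match logged driving behavior. As a toy illustration of that setup (not the official WOSAC metric, which evaluates distributional realism over many rollouts), the sketch below computes the minimum average displacement error (minADE) between K simulated rollouts and the logged trajectory; all array shapes and names here are assumptions for illustration.

```python
import numpy as np

def min_ade(sim_rollouts: np.ndarray, logged: np.ndarray) -> float:
    """Toy realism proxy: minimum average displacement error.

    sim_rollouts: (K, T, 2) xy positions for K simulated rollouts.
    logged:       (T, 2) xy positions of the logged (real) trajectory.
    Returns the smallest per-rollout mean distance to the logged track.
    """
    # Per-step Euclidean distance of each rollout to the log: (K, T)
    per_step = np.linalg.norm(sim_rollouts - logged[None], axis=-1)
    # Average over time, then take the best (closest) rollout
    return float(per_step.mean(axis=1).min())
```

A rollout identical to the log scores 0; the metric rewards having at least one realistic sample among the K rollouts, which is the spirit (though not the substance) of likelihood-style realism metrics.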


Offline Reinforcement Learning for Visual Navigation

Dec 16, 2022
Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Figures 1–4 for Offline Reinforcement Learning for Visual Navigation

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial and error on real-world robots is logistically challenging, and methods that can instead leverage existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real world. We evaluate our system on off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training on this dataset, exhibiting behaviors that differ qualitatively depending on the user-specified reward function.
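A key ingredient of this offline setup is that the same logged dataset can be re-scored under different user-specified reward functions before training, which is what makes the learned behavior change without new data collection. The sketch below shows that relabeling step only; the transition format, field names, and example reward are hypothetical illustrations, not ReViND's actual implementation (which trains an offline RL algorithm on the relabeled data).

```python
def relabel(transitions, reward_fn):
    """Replace each transition's stored reward with a user-specified one.

    transitions: list of (obs, act, reward, next_obs, done) tuples.
    reward_fn:   callable (obs, act, next_obs) -> float.
    """
    return [(obs, act, reward_fn(obs, act, next_obs), next_obs, done)
            for (obs, act, _, next_obs, done) in transitions]

def paved_path_reward(obs, act, next_obs):
    """Hypothetical reward: prefer paved paths, with a goal-reaching bonus."""
    return 1.0 * next_obs["on_paved_path"] + 10.0 * next_obs["at_goal"]
```

Swapping `paved_path_reward` for, say, a grass-avoidance reward relabels the same dataset differently, so the downstream offline RL policy exhibits qualitatively different navigation behavior.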

* Project page https://sites.google.com/view/revind/home 