Brandyn White

The Waymo Open Sim Agents Challenge

May 19, 2023
Nico Montali, John Lambert, Paul Mougin, Alex Kuefler, Nick Rhinehart, Michelle Li, Cole Gulino, Tristan Emrich, Zoey Yang, Shimon Whiteson, Brandyn White, Dragomir Anguelov

In this work, we define the Waymo Open Sim Agents Challenge (WOSAC). Simulation with realistic, interactive agents represents a key task for autonomous vehicle software development. WOSAC is the first public challenge to tackle this task and propose corresponding metrics. The goal of the challenge is to stimulate the design of realistic simulators that can be used to evaluate and train a behavior model for autonomous driving. We outline our evaluation methodology and present preliminary results for a number of different baseline simulation agent methods.
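
The abstract only outlines the evaluation methodology, but the core idea — scoring how probable the logged agent behavior is under the distribution induced by many simulated rollouts — can be sketched in a few lines. This is a minimal, illustrative sketch: the choice of feature (speed), the histogram binning, the rollout count, and the smoothing constant are all assumptions for illustration, not the challenge's actual metric definitions.

```python
import numpy as np

def histogram_nll(simulated_values, logged_value, bins, smoothing=1e-3):
    """Approximate NLL of a logged statistic under the empirical
    distribution induced by simulated rollouts (histogram estimate)."""
    counts, edges = np.histogram(simulated_values, bins=bins)
    probs = (counts + smoothing) / (counts.sum() + smoothing * len(counts))
    idx = np.clip(np.digitize(logged_value, edges) - 1, 0, len(counts) - 1)
    return -np.log(probs[idx])

# Illustrative use: 32 simulated rollouts of one agent's speed at one step.
rng = np.random.default_rng(0)
sim_speeds = rng.normal(loc=5.0, scale=1.0, size=32)   # m/s, simulated
logged_speed = 5.4                                     # m/s, from the log
bins = np.linspace(0.0, 15.0, 31)
print(f"speed NLL: {histogram_nll(sim_speeds, logged_speed, bins):.3f}")
```

Lower NLL means the simulated rollouts place more probability mass on what actually happened, i.e., the sim agents behave more realistically with respect to that statistic.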

Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios

Dec 21, 2022
Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Becca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine

Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to identify driving preferences and produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantially improve the safety and reliability of driving policies over those learned from imitation alone. In particular, we use a combination of imitation and reinforcement learning to train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision risk. To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
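
The combination described above can be sketched as a weighted sum of an imitation (behavior cloning) loss on logged expert actions and a reinforcement term driven by a simple reward such as a collision penalty. In the sketch below, a REINFORCE-style policy-gradient term stands in for the paper's RL component; the network, the placeholder batches, the reward, and the mixing weight are all illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Hypothetical toy policy: maps a state vector to a Gaussian over actions.
class Policy(nn.Module):
    def __init__(self, state_dim=16, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                                 nn.Linear(64, action_dim))
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def dist(self, state):
        return torch.distributions.Normal(self.net(state), self.log_std.exp())

policy = Policy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Placeholder batches: logged expert (state, action) pairs for imitation,
# and simulated rollout (state, action, reward) tuples for RL. The simple
# reward here is illustrative: -1 on collision, 0 otherwise.
expert_s, expert_a = torch.randn(128, 16), torch.randn(128, 2)
roll_s, roll_a = torch.randn(128, 16), torch.randn(128, 2)
roll_r = -(torch.rand(128) < 0.05).float()  # sparse collision penalty

bc_loss = -policy.dist(expert_s).log_prob(expert_a).sum(-1).mean()
rl_loss = -(policy.dist(roll_s).log_prob(roll_a).sum(-1) * roll_r).mean()

# The mixing weight between imitation and reinforcement is a free
# hyperparameter in this sketch, not a value from the paper.
loss = bc_loss + 0.1 * rl_loss
opt.zero_grad(); loss.backward(); opt.step()
```

The imitation term keeps the policy human-like on the bulk of the data, while the reward term pushes down the probability of actions that led to collisions in simulation.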

Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving

Oct 18, 2022
Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov

We demonstrate the first large-scale application of model-based generative adversarial imitation learning (MGAIL) to the task of dense urban self-driving. We augment standard MGAIL using a hierarchical model to enable generalization to arbitrary goal routes, and measure performance using a closed-loop evaluation framework with simulated interactive agents. We train policies from expert trajectories collected from real vehicles driving over 100,000 miles in San Francisco, and demonstrate a steerable policy that can navigate robustly even in a zero-shot setting, generalizing to synthetic scenarios with novel goals that never occurred in real-world driving. We also demonstrate the importance of mixing closed-loop MGAIL losses with open-loop behavior cloning losses, and show our best policy approaches the performance of the expert. We evaluate our imitative model in both average and challenging scenarios, and show how it can serve as a useful prior to plan successful trajectories.
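
The hierarchical decomposition described above — a high-level component that handles goal routes and a low-level component that drives conditioned on the chosen route — can be sketched as two goal-conditioned modules. The module names, layer sizes, and route encoding below are hypothetical stand-ins, not the paper's MGAIL architecture.

```python
import torch
import torch.nn as nn

class HighLevelRouter(nn.Module):
    """Scores candidate goal routes given the current scene state."""
    def __init__(self, state_dim=32, route_dim=8):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(state_dim + route_dim, 64),
                                   nn.ReLU(), nn.Linear(64, 1))

    def forward(self, state, candidate_routes):
        # candidate_routes: (num_routes, route_dim) encodings of goal routes
        s = state.expand(candidate_routes.shape[0], -1)
        logits = self.score(torch.cat([s, candidate_routes], -1)).squeeze(-1)
        return logits.argmax()  # index of the most promising route

class LowLevelDriver(nn.Module):
    """Produces a driving action conditioned on state and chosen route."""
    def __init__(self, state_dim=32, route_dim=8, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + route_dim, 64),
                                 nn.ReLU(), nn.Linear(64, action_dim))

    def forward(self, state, route):
        return self.net(torch.cat([state, route], -1))

state = torch.randn(32)
routes = torch.randn(5, 8)                 # 5 candidate goal routes
router, driver = HighLevelRouter(), LowLevelDriver()
chosen = routes[router(state, routes)]
action = driver(state, chosen)             # e.g. steering/acceleration
```

Because the low-level policy only ever sees a route encoding, it can be steered to novel goals at test time — the zero-shot generalization the abstract describes.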

* IROS 2022 

StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving

Jun 02, 2022
Jinkyu Kim, Reza Mahjourian, Scott Ettinger, Mayank Bansal, Brandyn White, Ben Sapp, Dragomir Anguelov

We introduce a motion forecasting (behavior prediction) method that meets the latency requirements for autonomous driving in dense urban environments without sacrificing accuracy. A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency. In addition to predicting trajectories, our scene encoder lends itself to predicting whole-scene probabilistic occupancy grids, a complementary output representation suitable for busy urban environments. Occupancy grids allow the AV to reason collectively about the behavior of groups of agents without processing their individual trajectories. We demonstrate the effectiveness of our sparse input representation and our model in terms of computation and accuracy over three datasets. We further show that co-training consistent trajectory and occupancy predictions improves upon state-of-the-art performance under standard metrics.
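
The whole-scene occupancy output described above can be illustrated with a minimal renderer: per-agent predicted positions with mode probabilities are combined into a per-cell probability that at least one agent occupies the cell. The grid extent, resolution, and toy inputs below are assumptions for illustration, not StopNet's actual decoder.

```python
import numpy as np

def occupancy_grid(pred_positions, pred_probs, extent=50.0, cells=100):
    """Render predicted agent positions into a probabilistic occupancy
    grid: each cell holds P(at least one agent occupies it), combining
    per-agent mode probabilities independently."""
    grid_free = np.ones((cells, cells))  # P(cell is free)
    scale = cells / (2 * extent)
    for modes, probs in zip(pred_positions, pred_probs):
        for (x, y), p in zip(modes, probs):
            i = int((x + extent) * scale)
            j = int((y + extent) * scale)
            if 0 <= i < cells and 0 <= j < cells:
                grid_free[j, i] *= (1.0 - p)  # agent misses cell w.p. 1-p
    return 1.0 - grid_free  # P(occupied)

# Illustrative inputs: 3 agents, 2 predicted modes each, at one future step.
positions = np.array([[[1.0, 2.0], [1.5, 2.5]],
                      [[-10.0, 4.0], [-9.0, 4.5]],
                      [[20.0, -3.0], [21.0, -2.0]]])
probs = np.array([[0.7, 0.3], [0.6, 0.4], [0.5, 0.5]])
grid = occupancy_grid(positions, probs)
print("max cell occupancy:", grid.max())
```

A single grid like this summarizes an arbitrary number of agents at fixed cost, which is why the abstract frames it as a way to reason about crowds without decoding individual trajectories.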

* IEEE International Conference on Robotics and Automation 2022  

Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation

May 06, 2022
Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson

Simulation is a crucial tool for accelerating the development of autonomous vehicles. Making simulation realistic requires models of the human road users who interact with such cars. Such models can be obtained by applying learning from demonstration (LfD) to trajectories observed by cars already on the road. However, existing LfD methods are typically insufficient, yielding policies that frequently collide or drive off the road. To address this problem, we propose Symphony, which greatly improves realism by combining conventional policies with a parallel beam search. The beam search refines these policies on the fly by pruning branches that are unfavourably evaluated by a discriminator. However, it can also harm diversity, i.e., how well the agents cover the entire distribution of realistic behaviour, as pruning can encourage mode collapse. Symphony addresses this issue with a hierarchical approach, factoring agent behaviour into goal generation and goal conditioning. The use of such goals ensures that agent diversity neither disappears during adversarial training nor is pruned away by the beam search. Experiments on both proprietary and open Waymo datasets confirm that Symphony agents learn more realistic and diverse behaviour than several baselines.
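
The parallel beam search described above can be sketched as follows: at each step, every beam is expanded with several sampled continuations from the policy, and only the branches the discriminator scores as most realistic survive. The placeholder policy and discriminator below are toy stand-ins, the goal-generation hierarchy is omitted, and the beam width, expansion factor, and horizon are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_step(states):
    """Placeholder learned policy: proposes noisy next states."""
    return states + rng.normal(0.0, 0.5, size=states.shape)

def discriminator(states):
    """Placeholder realism score in [0, 1]; higher is more realistic.
    Here it simply favours states near the origin, purely for illustration."""
    return np.exp(-np.linalg.norm(states, axis=-1))

def beam_search_rollout(init_state, beam_width=8, expand=4, horizon=10):
    beams = np.repeat(init_state[None, :], beam_width, axis=0)
    for _ in range(horizon):
        # Expand each beam with several sampled continuations...
        candidates = np.concatenate(
            [policy_step(beams) for _ in range(expand)], axis=0)
        # ...and prune the branches the discriminator judges unrealistic.
        scores = discriminator(candidates)
        keep = np.argsort(scores)[-beam_width:]
        beams = candidates[keep]
    return beams

final = beam_search_rollout(np.zeros(2))
print("surviving rollout endpoints:\n", final)
```

Pruning on discriminator score is what raises realism; as the abstract notes, the goal-generation hierarchy then guards against the mode collapse that this pruning could otherwise induce.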

* Accepted to ICRA-2022 