Critic Sequential Monte Carlo

May 30, 2022
Vasileios Lioutas, Jonathan Wilder Lavington, Justice Sefas, Matthew Niedoba, Yunpeng Liu, Berend Zwartsenberg, Setareh Dabiri, Frank Wood, Adam Scibior


We introduce CriticSMC, a new algorithm for planning as inference built from a novel composition of sequential Monte Carlo with learned soft-Q function heuristic factors. The algorithm is structured to allow the use of large numbers of putative particles, leading to efficient utilization of computational resources and effective discovery of high-reward trajectories, even in environments with difficult reward surfaces such as those arising from hard constraints. Relative to prior art, our approach notably remains compatible with model-free reinforcement learning, in the sense that the implicit policy it produces can be used at test time in the absence of a world model. Our experiments on self-driving car collision avoidance in simulation demonstrate improvements over baselines in infraction minimization relative to computational effort, while maintaining the diversity and realism of the trajectories found.
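
As a rough illustration of the idea of scoring many putative particles with a critic before committing to any of them, here is a minimal sketch of one resampling step on a toy 1-D problem. It is not the authors' implementation: the critic `toy_soft_q`, the world model `toy_step`, the Gaussian proposal, and all parameter names are hypothetical stand-ins, and the sketch omits the weight corrections a full sequential Monte Carlo sampler would carry between steps.

```python
import numpy as np

def toy_soft_q(states, actions):
    # Hypothetical stand-in for a learned soft-Q critic (assumption):
    # penalize actions that would push the position outside [-1, 1].
    next_pos = states + actions
    return -10.0 * np.maximum(np.abs(next_pos) - 1.0, 0.0)

def toy_step(states, actions):
    # Hypothetical deterministic world model (assumption).
    return states + actions

def critic_guided_smc_step(states, rng, num_putative=16, action_scale=0.5):
    """One illustrative step: propose, score with the critic, resample, advance."""
    n = states.shape[0]
    # Propose num_putative candidate actions per particle from a Gaussian prior.
    actions = rng.normal(0.0, action_scale, size=(n, num_putative))
    # Use critic values as heuristic log-weights for every putative particle.
    log_w = toy_soft_q(states[:, None], actions).ravel()
    weights = np.exp(log_w - log_w.max())  # subtract max for numerical stability
    weights /= weights.sum()
    # Resample n survivors out of the n * num_putative putative particles.
    idx = rng.choice(n * num_putative, size=n, p=weights)
    parent, which = np.divmod(idx, num_putative)
    chosen_actions = actions[parent, which]
    # Only the survivors are advanced through the world model.
    new_states = toy_step(states[parent], chosen_actions)
    return new_states, chosen_actions

rng = np.random.default_rng(0)
states = np.zeros(8)
new_states, chosen_actions = critic_guided_smc_step(states, rng)
```

In this sketch the world model is called only for the resampled survivors, so increasing `num_putative` adds critic evaluations but no extra model steps; that is one plausible reading of how a large pool of putative particles can stay cheap.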

* 20 pages, 3 figures  
View paper on arXiv and OpenReview