
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets

Oct 10, 2021
Jake Grigsby, Yanjun Qi



Recent Offline Reinforcement Learning methods have succeeded in learning high-performance policies from fixed datasets of experience. A particularly effective approach learns to first identify and then mimic optimal decision-making strategies. Our work evaluates this method's ability to scale to vast datasets consisting almost entirely of sub-optimal noise. A thorough investigation on a custom benchmark helps identify several key challenges involved in learning from high-noise datasets. We re-purpose prioritized experience sampling to locate expert-level demonstrations among millions of low-performance samples. This modification enables offline agents to learn state-of-the-art policies in benchmark tasks using datasets where expert actions are outnumbered nearly 65:1.
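To make the sampling idea concrete, here is a minimal sketch of how prioritized sampling could surface rare expert transitions in a dataset that is mostly noise. It is an illustration under stated assumptions, not the authors' implementation: the binary filter (keep a sample when its estimated advantage is positive), the `eps` priority floor, the toy advantage values, and all function names are hypothetical. In a real agent the advantages would come from a learned critic, and the filter would also weight the behavioral cloning loss.

```python
import random


def binary_filter(advantage):
    """Assumed filter: keep a sample only if its estimated advantage is positive."""
    return 1.0 if advantage > 0.0 else 0.0


def sampling_probabilities(advantages, eps=1e-3):
    """Convert filter values into sampling probabilities.

    eps is a hypothetical floor that keeps every sample reachable, so its
    priority could be refreshed if the advantage estimate later changes.
    """
    priorities = [binary_filter(a) + eps for a in advantages]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(probs, batch_size, rng):
    """Draw dataset indices with replacement, proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


rng = random.Random(0)

# Toy dataset: expert actions are outnumbered 65:1 by noise.
n_expert, n_noise = 100, 6500
# Stand-in for critic output: experts get positive advantage, noise negative.
advantages = [1.0] * n_expert + [-1.0] * n_noise

probs = sampling_probabilities(advantages)
uniform_batch = [rng.randrange(len(advantages)) for _ in range(256)]
prioritized_batch = sample_batch(probs, 256, rng)


def expert_fraction(batch):
    """Fraction of a batch drawn from the expert portion of the dataset."""
    return sum(i < n_expert for i in batch) / len(batch)


print("uniform:", expert_fraction(uniform_batch))
print("prioritized:", expert_fraction(prioritized_batch))
```

Under uniform sampling, a batch in this toy setup contains only a handful of expert transitions, so most gradient steps are spent on samples the filter would zero out. Sampling in proportion to the filter concentrates each batch on the transitions that actually contribute to the cloning loss.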

* Honors Undergraduate Thesis, UVA 2021. 15 pages  
