Abstract: Model-based reinforcement learning (MBRL) offers an intuitive way to increase the sample efficiency of model-free RL methods by simultaneously training a world model that learns to predict the future. MBRL methods have progressed largely by prioritising the actor, while optimisation of world-model learning has been comparatively neglected. Improving the fidelity of the world model and reducing its time to convergence can yield significant downstream benefits, one of which is improving the performance of any actor it trains. We propose a novel approach that anticipates and actively seeks out high-entropy states using short-horizon latent predictions generated by the world model, offering a principled alternative to traditional curiosity-driven methods that chase once-novel states well after they were first stumbled into. While many model predictive control (MPC) based methods offer similar alternatives, they typically lack commitment, re-synthesising multi-step plans after every step. To mitigate this, we present a hierarchical planner that dynamically decides when to replan, the planning horizon length, and the weighting between reward and entropy. While our method can in principle be applied to any model that trains its actors solely on model-generated data, we apply it only to Dreamer as a proof of concept. Our method finishes the procedurally generated Miniworld mazes 50% faster than base Dreamer at convergence, and its policy trained in imagination converges in only 60% of the environment steps that base Dreamer needs.
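The planning idea described in this abstract can be sketched as follows; this is an illustrative sketch rather than the paper's implementation, and the function names, the random-shooting candidate search, and the ensemble-disagreement entropy proxy are all assumptions made for the example.

```python
# Illustrative sketch only (not the paper's code): score short-horizon imagined
# rollouts by a weighted sum of predicted reward and a predicted-state entropy
# proxy (disagreement among an ensemble of latent-dynamics heads). All callable
# names (predict_next, predict_reward, ensemble) are assumptions.
import numpy as np

def score_plan(predict_next, predict_reward, ensemble, latent, actions, w):
    """Accumulate reward + w * entropy over one imagined latent rollout."""
    reward, entropy = 0.0, 0.0
    for a in actions:
        latent = predict_next(latent, a)              # latent-space rollout step
        reward += predict_reward(latent)
        preds = np.stack([head(latent, a) for head in ensemble])
        entropy += preds.std(axis=0).mean()           # disagreement as entropy proxy
    return reward + w * entropy

def plan(predict_next, predict_reward, ensemble, latent,
         horizon, action_dim, n_candidates, w, rng):
    """Random-shooting planner: return the best-scoring candidate action sequence."""
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        score = score_plan(predict_next, predict_reward, ensemble,
                           latent, actions, w)
        if score > best_score:
            best, best_score = actions, score
    return best
```

In this sketch, the hierarchical planner from the abstract would correspond to choosing `horizon`, the entropy weight `w`, and how long to commit to the returned plan before calling `plan` again.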
Abstract: Feature importance methods promise to provide a ranking of features according to their importance for a given classification task. A wide range of methods exists, but their rankings often disagree, and they are inherently difficult to evaluate due to a lack of ground truth beyond synthetic datasets. In this work, we put feature importance methods to the test on real-world data in the domain of cardiology, where we try to distinguish three specific pathologies from healthy subjects on the basis of ECG features, using the features in cardiologists' decision rules as ground truth for comparison. Some methods performed well in general and others poorly, while some did well on some but not all of the problems considered.
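A minimal sketch of the kind of comparison described above, assuming permutation importance as one representative feature-importance method and a synthetic stand-in for the ECG feature table; the feature names and the "ground truth" set below are placeholders, not the study's data.

```python
# Illustrative sketch, not the study's code: rank tabular features with one
# importance method (permutation importance) and measure the overlap of the
# top-ranked features with a known "ground truth" feature set. The feature
# names, ground-truth set, and synthetic data are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = [f"ecg_feature_{i}" for i in range(10)]    # placeholder names
ground_truth = {"ecg_feature_0", "ecg_feature_3"}          # placeholder rule features

# Synthetic stand-in: the two "ground truth" features drive the label.
X = rng.normal(size=(500, len(feature_names)))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
top_k = {feature_names[i] for i in ranking[: len(ground_truth)]}
print("top-ranked features:", top_k)
print("overlap with ground truth:", len(top_k & ground_truth) / len(ground_truth))
```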