Steve Harris

Optimising antibiotic switching via forecasting of patient physiology

Mar 09, 2026

Magnus Ross, Nel Swanepoel, Akish Luintel, Emma McGuire, Ingemar J. Cox, Steve Harris, Vasileios Lampos

Abstract: Timely transition from intravenous (IV) to oral antibiotic therapy shortens hospital stays, reduces catheter-related infections, and lowers healthcare costs, yet one in five patients in England remain on IV antibiotics despite meeting switching criteria. Clinical decision support systems can improve switching rates, but approaches that learn from historical decisions reproduce the delays and inconsistencies of routine practice. We propose using neural processes to model vital sign trajectories probabilistically, predicting switch-readiness by comparing forecasts against clinical guidelines rather than learning from past actions, and ranking patients to prioritise clinical review. The design yields interpretable outputs, adapts to updated guidelines without retraining, and preserves clinical judgement. Validated on MIMIC-IV (US intensive care, 6,333 encounters) and UCLH (a large urban academic UK hospital group, 10,584 encounters), the system selects 2.2-3.2$\times$ more relevant patients than random. Our results demonstrate that forecasting patient physiology offers a principled foundation for decision support in antibiotic stewardship.
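The ranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes some probabilistic forecaster has already produced Monte Carlo samples of future vital signs for each patient, and it scores each patient by the fraction of samples that stay inside a guideline range. The vital sign names, the numeric limits in `EXAMPLE_GUIDELINE`, and the function names are all hypothetical placeholders, not real clinical criteria.

```python
import numpy as np

# Hypothetical guideline limits, for illustration only (not clinical advice).
# Each vital sign maps to an inclusive (low, high) acceptable range.
EXAMPLE_GUIDELINE = {
    "temperature_c": (36.0, 38.0),
    "heart_rate_bpm": (40.0, 90.0),
    "resp_rate_bpm": (8.0, 20.0),
}


def switch_readiness(samples, guideline=EXAMPLE_GUIDELINE):
    """Fraction of forecast samples in which every vital sign stays within
    its guideline range at every forecast step.

    samples: array of shape (n_samples, n_steps, n_vitals), with the last
    axis ordered like the keys of `guideline`.
    """
    low = np.array([lo for lo, _ in guideline.values()])
    high = np.array([hi for _, hi in guideline.values()])
    within = (samples >= low) & (samples <= high)  # broadcasts over last axis
    ok_per_sample = within.all(axis=(1, 2))        # one boolean per sample
    return float(ok_per_sample.mean())


def rank_patients(forecasts, guideline=EXAMPLE_GUIDELINE):
    """Sort patient ids by descending readiness score for clinical review."""
    scores = {pid: switch_readiness(s, guideline) for pid, s in forecasts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because the guideline is an argument rather than something learned, changing the thresholds in this sketch changes the ranking without refitting the forecaster, which mirrors the "adapts to updated guidelines without retraining" property claimed in the abstract.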

* 32 pages, 8 figures

The hidden risks of temporal resampling in clinical reinforcement learning

Feb 06, 2026

Thomas Frost, Hrisheekesh Vaidya, Steve Harris

Abstract: Offline reinforcement learning (ORL) has shown potential for improving decision-making in healthcare. However, contemporary research typically aggregates patient data into fixed time intervals, simplifying their mapping to standard ORL frameworks. The impact of these temporal manipulations on model safety and efficacy remains poorly understood. In this work, using both a gridworld navigation task and the UVA/Padova clinical diabetes simulator, we demonstrate that temporal resampling significantly degrades the performance of offline reinforcement learning algorithms during live deployment. We propose three mechanisms that drive this failure: (i) the generation of counterfactual trajectories, (ii) the distortion of temporal expectations, and (iii) the compounding of generalisation errors. Crucially, we find that standard off-policy evaluation metrics can fail to detect these drops in performance. Our findings reveal a fundamental risk in current healthcare ORL pipelines and emphasise the need for methods that explicitly handle the irregular timing of clinical decision-making.

* 12 pages, 4 figures. Currently under submission to npj Digital Medicine

Robust Real-Time Mortality Prediction in the Intensive Care Unit using Temporal Difference Learning

Nov 06, 2024

Thomas Frost, Kezhi Li, Steve Harris

Abstract: The task of predicting long-term patient outcomes using supervised machine learning is a challenging one, in part because of the high variance of each patient's trajectory, which can result in the model over-fitting to the training data. Temporal difference (TD) learning, a common reinforcement learning technique, may reduce variance by generalising learning to the pattern of state transitions rather than terminal outcomes. However, in healthcare this method requires several strong assumptions about patient states, and there appears to be limited literature evaluating the performance of TD learning against traditional supervised learning methods for long-term health outcome prediction tasks. In this study, we define a framework for applying TD learning to real-time irregularly sampled time series data using a Semi-Markov Reward Process. We evaluate the model framework in predicting intensive care mortality and show that TD learning under this framework can result in improved model robustness compared to standard supervised learning methods, and that this robustness is maintained even when validated on external datasets. This approach may offer a more reliable method when learning to predict patient outcomes using high-variance irregular time series data.
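As a rough illustration of TD learning on irregularly sampled data, the sketch below applies a generic tabular TD(0) update to a semi-Markov reward process, where the discount applied to the bootstrapped value depends on the elapsed time between observations. This is a textbook-style simplification under stated assumptions, not the framework from the paper: the discrete state indices, the `(state, dt, reward, next_state)` transition format, and the function name `td0_smrp` are all hypothetical.

```python
import numpy as np


def td0_smrp(episodes, n_states, gamma=0.99, alpha=0.1, n_epochs=50):
    """Tabular TD(0) value estimation on a semi-Markov reward process.

    episodes: list of episodes, each a list of transitions
        (state, dt, reward, next_state), where dt is the elapsed time to the
        next observation and next_state is None at termination.
    Returns the estimated value of each state.
    """
    v = np.zeros(n_states)
    for _ in range(n_epochs):
        for episode in episodes:
            for s, dt, r, s_next in episode:
                # Terminal transitions bootstrap from zero.
                bootstrap = 0.0 if s_next is None else v[s_next]
                # Discount by elapsed time rather than by a fixed step.
                target = r + (gamma ** dt) * bootstrap
                v[s] += alpha * (target - v[s])
    return v
```

In a mortality setting one might, for example, give a reward of 1 on a terminal transition that ends in death and 0 otherwise, so that the learned value approximates a time-discounted risk; that reward design is an assumption made here for illustration only.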

* To be published in the Proceedings of the 4th Machine Learning for Health symposium, Proceedings of Machine Learning Research (PMLR)
