Jonathan Erskine

Active Query Selection for Crowd-Based Reinforcement Learning

Aug 26, 2025

Jonathan Erskine, Taku Yamagata, Raúl Santos-Rodríguez

Abstract: Preference-based reinforcement learning has gained prominence as a strategy for training agents in environments where the reward signal is difficult to specify or misaligned with human intent. However, its effectiveness is often limited by the high cost and low availability of reliable human input, especially in domains where expert feedback is scarce or errors are costly. To address this, we propose a novel framework that combines two complementary strategies: probabilistic crowd modelling to handle noisy, multi-annotator feedback, and active learning to prioritize feedback on the most informative agent actions. We extend the Advise algorithm to support multiple trainers, estimate their reliability online, and incorporate entropy-based query selection to guide feedback requests. We evaluate our approach in a set of environments that span both synthetic and real-world-inspired settings, including 2D games (Taxi, Pacman, Frozen Lake) and a blood glucose control task for Type 1 Diabetes using the clinically approved UVA/Padova simulator. Our preliminary results demonstrate that agents trained with feedback on uncertain trajectories exhibit faster learning in most tasks, and we outperform the baselines for the blood glucose control task.

* 7 pages, 4 figures, 2 tables plus appendices
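
The entropy-based query selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how entropy is computed or over what, so the code below assumes a tabular policy given as per-state action probabilities and simply requests feedback on the states with the highest Shannon entropy. The names `policy_entropy`, `select_queries`, and the example `policy` are hypothetical.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(state_action_probs, budget):
    """Return the `budget` states whose action distributions are most uncertain.

    Assumption: uncertainty is measured per state by policy entropy; the
    paper may use a different quantity (for example, per trajectory).
    """
    ranked = sorted(
        state_action_probs,
        key=lambda s: policy_entropy(state_action_probs[s]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical policy over three states and four actions.
policy = {
    "s0": [0.97, 0.01, 0.01, 0.01],  # nearly deterministic, low entropy
    "s1": [0.25, 0.25, 0.25, 0.25],  # uniform, maximum entropy
    "s2": [0.60, 0.30, 0.05, 0.05],  # in between
}
queries = select_queries(policy, budget=2)
print(queries)
```

With a feedback budget of two, the near-deterministic state `s0` is skipped and the trainer is asked about the two states where the agent is least sure of its action.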

Learning Confidence Bounds for Classification with Imbalanced Data

Jul 16, 2024

Matt Clifford, Jonathan Erskine, Alexander Hepburn, Raúl Santos-Rodríguez, Dario Garcia-Garcia

Abstract: Class imbalance poses a significant challenge in classification tasks, where traditional approaches often lead to biased models and unreliable predictions. Undersampling and oversampling techniques have been commonly employed to address this issue, yet they suffer from inherent limitations stemming from their simplistic approach, such as loss of information and additional biases, respectively. In this paper, we propose a novel framework that leverages learning theory and concentration inequalities to overcome the shortcomings of traditional solutions. We focus on understanding the uncertainty in a class-dependent manner, as captured by confidence bounds that we directly embed into the learning process. By incorporating class-dependent estimates, our method can effectively adapt to the varying degrees of imbalance across different classes, resulting in more robust and reliable classification outcomes. We empirically show how our framework provides a promising direction for handling imbalanced data in classification tasks, offering practitioners a valuable tool for building more accurate and trustworthy models.

* Accepted at ECAI 2024 main track
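
To make the idea of class-dependent confidence bounds concrete, here is a small sketch. The abstract does not say which concentration inequality the paper uses, so this example assumes Hoeffding's inequality for a quantity bounded in [0, 1]: with probability at least 1 - delta, an empirical mean over n samples lies within sqrt(ln(2/delta) / (2n)) of its expectation. The function name `hoeffding_radius` and the class counts are hypothetical, and the sketch does not show how the paper embeds the bounds into training.

```python
import math


def hoeffding_radius(n_samples, delta):
    """Two-sided Hoeffding radius for the mean of n samples in [0, 1]."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))


# Hypothetical imbalanced dataset: 900 majority and 100 minority samples.
class_counts = {"majority": 900, "minority": 100}
delta = 0.05

# One radius per class: fewer samples give a wider (less certain) bound.
radii = {label: hoeffding_radius(n, delta) for label, n in class_counts.items()}
for label, radius in radii.items():
    print(f"{label}: n={class_counts[label]}, radius={radius:.4f}")
```

Because the radius shrinks as 1/sqrt(n), the minority class here gets a bound three times wider than the majority class, which is the kind of class-dependent uncertainty the abstract refers to.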

An Interactive Human-Machine Learning Interface for Collecting and Learning from Complex Annotations

Mar 28, 2024

Jonathan Erskine, Matt Clifford, Alexander Hepburn, Raúl Santos-Rodríguez

Abstract: Human-Computer Interaction has been shown to lead to improvements in machine learning systems by boosting model performance, accelerating learning and building user confidence. In this work, we aim to alleviate the expectation that human annotators adapt to the constraints imposed by traditional labels by allowing for extra flexibility in the form in which supervision information is collected. For this, we propose a human-machine learning interface for binary classification tasks which enables human annotators to utilise counterfactual examples to complement standard binary labels as annotations for a dataset. Finally, we discuss the challenges in future extensions of this work.

* 4 pages, 2 figures, Submitted to IJCAI 2024 Demonstration Track
