Abstract: Beliefs and values are increasingly being incorporated into our AI systems through alignment processes, such as carefully curating data collection principles or regularizing the loss function used for training. However, the meta-alignment problem is that these human beliefs are diverse and not aligned across populations; furthermore, the implicit strength of each belief may not be well calibrated even among humans, especially when trying to generalize across contexts. Specifically, in high-regret situations, we observe that contextual counterfactuals and recourse costs are particularly important in updating a decision maker's beliefs and the strength with which such beliefs are held. Therefore, we argue that including counterfactuals is key to accurately calibrating beliefs during alignment. To do this, we first segment belief diversity into two categories: subjectivity (across individuals within a population) and epistemic uncertainty (within an individual across different contexts). Leveraging our notion of epistemic uncertainty, we introduce the "belief calibration cycle" framework, which calibrates this diversity of beliefs more holistically by combining context-driven counterfactual reasoning with multi-objective optimization. We empirically apply our framework to find a Pareto frontier of clustered optimal belief strengths that generalize across different contexts, demonstrating its efficacy on a toy dataset for credit decisions.
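As a rough illustration of what finding a Pareto frontier involves, the sketch below filters a set of candidate belief-strength settings down to the non-dominated ones. It is not the framework described in the abstract: the two objectives (a hypothetical regret score in each of two contexts, both minimized), the candidate values, and the function names `dominates` and `pareto_front` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (regret in context A, regret in context B).
candidates = [
    (0.10, 0.90),
    (0.20, 0.40),
    (0.30, 0.50),  # dominated by (0.20, 0.40)
    (0.60, 0.10),
    (0.70, 0.30),  # dominated by (0.60, 0.10)
]

front = pareto_front(candidates)
print(front)
```

The quadratic scan is adequate for a toy dataset; a larger candidate set would likely call for a sort-based or library implementation.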
Abstract: To collaborate well with robots, we must be able to understand their decision making. Humans naturally infer other agents' beliefs and desires by reasoning about their observable behavior in a way that resembles inverse reinforcement learning (IRL). Thus, robots can convey their beliefs and desires by providing demonstrations that are informative for a human's IRL. An informative demonstration is one that differs strongly from the learner's expectations of what the robot will do, given the learner's current understanding of the robot's decision making. However, standard IRL does not model the learner's existing expectations, and thus cannot do this counterfactual reasoning. We propose to incorporate the learner's current understanding of the robot's decision making into our model of human IRL, so that our robot can select demonstrations that maximize the human's understanding. We also propose a novel measure for estimating how difficult it is for a human to predict a robot's behavior in unseen environments. A user study finds that our test difficulty measure correlates well with human performance and confidence. Interestingly, considering human beliefs and counterfactuals when selecting demonstrations decreases human performance on easy tests but increases performance on difficult tests, providing insight into how best to utilize such models.
Abstract: Validation accuracy is a necessary, but not sufficient, measure of a neural network classifier's quality. High validation accuracy during development does not guarantee that a model is free of serious flaws, such as vulnerability to adversarial attacks or a tendency to misclassify (with high confidence) data it was not trained on. The model may also be incomprehensible to a human or base its decisions on unreasonable criteria. These problems, which are not unique to classifiers, have been the focus of a substantial amount of recent research. However, they are not prioritized during model development, which almost always optimizes for validation accuracy to the exclusion of everything else. The product of this approach is likely to fail in unexpected ways outside of the training environment. We believe that, in addition to validation accuracy, the model development process must give added weight to other performance metrics such as explainability, resistance to adversarial attacks, and overconfidence on out-of-distribution data.
Abstract: Radiation detection has, to date, largely been a manual inspection process using point sensors such as Geiger-Müller counters and scintillation spectrometers. While their observations of source proximity prove useful, they lack the directional information necessary for efficient source localization and characterization in cluttered environments with multiple radiation sources. The recent commercialization of Compton gamma cameras provides directional information to the broader radiation detection community for the first time. This paper presents the integration of a Compton gamma camera with a self-localizing ground robot for accurate 3D radiation mapping. Using the position and orientation of the robot, radiation images from the gamma camera are accumulated over a traversed path in a shared frame of reference to construct a consistent voxel grid-based radiation map. The peaks of the map at pre-specified energy windows are selected as the source location estimates, which are compared to the ground truth source locations. The proposed approach localizes multiple sources to within an average of 0.2 m in two laboratory environments measuring 5 m x 4 m and 14 m x 6 m.
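The accumulation step can be pictured with a deliberately simplified sketch. It assumes a 2D grid rather than the 3D voxel grid of the paper, reduces each gamma-camera image to a single bearing with an intensity, and uses a hypothetical `accumulate` helper, voxel size, and ray-marching step. It shows only the general idea of projecting directional readings from known robot poses into a shared map frame and taking the peak as the source estimate; it is not the paper's method.

```python
import math
from collections import defaultdict

VOXEL = 0.5  # cell edge length in metres (assumed value)


def accumulate(grid, pose, bearing, intensity, max_range=5.0, step=0.25):
    """Add `intensity` once to every cell crossed by a ray from the robot.

    pose    -- (x, y, yaw) of the robot in the shared map frame
    bearing -- direction of the reading relative to the robot heading (rad)
    """
    x, y, yaw = pose
    angle = yaw + bearing  # rotate the reading into the map frame
    visited = set()
    r = 0.0
    while r <= max_range:
        cell = (
            math.floor((x + r * math.cos(angle)) / VOXEL),
            math.floor((y + r * math.sin(angle)) / VOXEL),
        )
        if cell not in visited:  # count each cell once per ray
            visited.add(cell)
            grid[cell] += intensity
        r += step


grid = defaultdict(float)

# Two hypothetical poses whose readings both point at a source near (2.25, 2.25).
accumulate(grid, (0.0, 2.25, 0.0), bearing=0.0, intensity=1.0)
accumulate(grid, (2.25, 0.0, math.pi / 2), bearing=0.0, intensity=1.0)

# The cell with the highest accumulated intensity is the source estimate.
peak = max(grid, key=grid.get)
estimate = ((peak[0] + 0.5) * VOXEL, (peak[1] + 0.5) * VOXEL)
print(peak, estimate)
```

Only the cell where the two rays cross receives intensity from both poses, which is why readings from several viewpoints sharpen the estimate.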