Abstract:Self-adaptive robots adjust their behaviors in response to unpredictable environmental changes. These robots often incorporate deep learning (DL) components into their software to support functionality such as perception, decision-making, and control, enhancing autonomy and self-adaptability. However, the inherent uncertainty of DL-enabled software makes it challenging to ensure its dependability in dynamic environments. Consequently, test generation techniques have been developed to test robot software, and classical mutation analysis injects faults into the software to assess the test suite's effectiveness in detecting the resulting failures. However, there is a lack of mutation analysis techniques to assess the effectiveness under the uncertainty inherent to DL-enabled software. To this end, we propose UAMTERS, an uncertainty-aware mutation analysis framework that introduces uncertainty-aware mutation operators to explicitly inject stochastic uncertainty into DL-enabled robotic software, simulating uncertainty in its behavior. We further propose mutation score metrics to quantify a test suite's ability to detect failures under varying levels of uncertainty. We evaluate UAMTERS across three robotic case studies, demonstrating that UAMTERS more effectively distinguishes test suite quality and captures uncertainty-induced failures in DL-enabled software.




Abstract:Self-adaptive robots (SARs) in complex, uncertain environments must proactively detect and address abnormal behaviors, including out-of-distribution (OOD) cases. To this end, digital twins offer a valuable solution for OOD detection. Thus, we present a digital twin-based approach for OOD detection (ODiSAR) in SARs. ODiSAR uses a Transformer-based digital twin to forecast SAR states and employs reconstruction error and Monte Carlo dropout for uncertainty quantification. By combining reconstruction error with predictive variance, the digital twin effectively detects OOD behaviors, even in previously unseen conditions. The digital twin also includes an explainability layer that links potential OOD to specific SAR states, offering insights for self-adaptation. We evaluated ODiSAR by creating digital twins of two industrial robots: one navigating an office environment, and another performing maritime ship navigation. In both cases, ODiSAR forecasts SAR behaviors (i.e., robot trajectories and vessel motion) and proactively detects OOD events. Our results showed that ODiSAR achieved high detection performance -- up to 98\% AUROC, 96\% TNR@TPR95, and 95\% F1-score -- while providing interpretable insights to support self-adaptation.