Creative Problem Solving (CPS) is a sub-area within Artificial Intelligence (AI) that focuses on methods for solving off-nominal, or anomalous problems in autonomous systems. Despite many advancements in planning and learning, resolving novel problems or adapting existing knowledge to a new context, especially in cases where the environment may change in unpredictable ways post deployment, remains a limiting factor in the safe and useful integration of intelligent systems. The emergence of increasingly autonomous systems dictates the necessity for AI agents to deal with environmental uncertainty through creativity. To stimulate further research in CPS, we present a definition and a framework of CPS, which we adopt to categorize existing AI methods in this field. Our framework consists of four main components of a CPS problem, namely, 1) problem formulation, 2) knowledge representation, 3) method of knowledge manipulation, and 4) method of evaluation. We conclude our survey with open research questions, and suggested directions for the future.
Intelligent decision support (IDS) systems leverage artificial intelligence techniques to generate recommendations that guide human users through the decision making phases of a task. However, a key challenge is that IDS systems are not perfect, and in complex real-world scenarios may produce incorrect output or fail to work altogether. The field of explainable AI planning (XAIP) has sought to develop techniques that make the decision making of sequential decision making AI systems more explainable to end-users. Critically, prior work in applying XAIP techniques to IDS systems has assumed that the plan being proposed by the planner is always optimal, and therefore the action or plan being recommended as decision support to the user is always correct. In this work, we examine novice user interactions with a non-robust IDS system -- one that occasionally recommends the wrong action, and one that may become unavailable after users have become accustomed to its guidance. We introduce a novel explanation type, subgoal-based explanations, for planning-based IDS systems, that supplements traditional IDS output with information about the subgoal toward which the recommended action would contribute. We demonstrate that subgoal-based explanations lead to improved user task performance, improve user ability to distinguish optimal and suboptimal IDS recommendations, are preferred by users, and enable more robust user performance in the case of IDS failure
Object grounding tasks aim to locate the target object in an image through verbal communications. Understanding human command is an important process needed for effective human-robot communication. However, this is challenging because human commands can be ambiguous and erroneous. This paper aims to disambiguate the human's referring expressions by allowing the agent to ask relevant questions based on semantic data obtained from scene graphs. We test if our agent can use relations between objects from a scene graph to ask semantically relevant questions that can disambiguate the original user command. In this paper, we present Incremental Grounding using Scene Graphs (IGSG), a disambiguation model that uses semantic data from an image scene graph and linguistic structures from a language scene graph to ground objects based on human command. Compared to the baseline, IGSG shows promising results in complex real-world scenes where there are multiple identical target objects. IGSG can effectively disambiguate ambiguous or wrong referring expressions by asking disambiguating questions back to the user.
When interacting in unstructured human environments, occasional robot failures are inevitable. When such failures occur, everyday people, rather than trained technicians, will be the first to respond. Existing natural language explanations hand-annotate contextual information from an environment to help everyday people understand robot failures. However, this methodology lacks generalizability and scalability. In our work, we introduce a more generalizable semantic explanation framework. Our framework autonomously captures the semantic information in a scene to produce semantically descriptive explanations for everyday users. To generate failure-focused explanations that are semantically grounded, we leverages both semantic scene graphs to extract spatial relations and object attributes from an environment, as well as pairwise ranking. Our results show that these semantically descriptive explanations significantly improve everyday users' ability to both identify failures and provide assistance for recovery than the existing state-of-the-art context-based explanations.
Multi-robot task allocation (MRTA) problems involve optimizing the allocation of robots to tasks. MRTA problems are known to be challenging when tasks require multiple robots and the team is composed of heterogeneous robots. These challenges are further exacerbated when we need to account for uncertainties encountered in the real-world. In this work, we address coalition formation in heterogeneous multi-robot teams with uncertain capabilities. We specifically focus on tasks that require coalitions to collectively satisfy certain minimum requirements. Existing approaches to uncertainty-aware task allocation either maximize expected pay-off (risk-neutral approaches) or improve worst-case or near-worst-case outcomes (risk-averse approaches). Within the context of our problem, we demonstrate the inherent limitations of unilaterally ignoring or avoiding risk and show that these approaches can in fact reduce the probability of satisfying task requirements. Inspired by models that explain foraging behaviors in animals, we develop a risk-adaptive approach to task allocation. Our approach adaptively switches between risk-averse and risk-seeking behavior in order to maximize the probability of satisfying task requirements. Comprehensive numerical experiments conclusively demonstrate that our risk-adaptive approach outperforms risk-neutral and risk-averse approaches. We also demonstrate the effectiveness of our approach using a simulated multi-robot emergency response scenario.
To realize effective heterogeneous multi-robot teams, researchers must leverage individual robots' relative strengths and coordinate their individual behaviors. Specifically, heterogeneous multi-robot systems must answer three important questions: \textit{who} (task allocation), \textit{when} (scheduling), and \textit{how} (motion planning). While specific variants of each of these problems are known to be NP-Hard, their interdependence only exacerbates the challenges involved in solving them together. In this paper, we present a novel framework that interleaves task allocation, scheduling, and motion planning. We introduce a search-based approach for trait-based time-extended task allocation named Incremental Task Allocation Graph Search (ITAGS). In contrast to approaches that solve the three problems in sequence, ITAGS's interleaved approach enables efficient search for allocations while simultaneously satisfying scheduling constraints and accounting for the time taken to execute motion plans. To enable effective interleaving, we develop a convex combination of two search heuristics that optimizes the satisfaction of task requirements as well as the makespan of the associated schedule. We demonstrate the efficacy of ITAGS using detailed ablation studies and comparisons against two state-of-the-art algorithms in a simulated emergency response domain.
Smart home environments are designed to provide services that help improve the quality of life for the occupant via a variety of sensors and actuators installed throughout the space. Many automated actions taken by a smart home are governed by the output of an underlying activity recognition system. However, activity recognition systems may not be perfectly accurate and therefore inconsistencies in smart home operations can lead a user to wonder "why did the smart home do that?" In this work, we build on insights from Explainable Artificial Intelligence (XAI) techniques to contribute computational methods for explainable activity recognition. Specifically, we generate explanations for smart home activity recognition systems that explain what about an activity led to the given classification. To do so, we introduce four computational techniques for generating natural language explanations of smart home data and compare their effectiveness at generating meaningful explanations. Through a study with everyday users, we evaluate user preferences towards the four explanation types. Our results show that the leading approach, SHAP, has a 92% success rate in generating accurate explanations. Moreover, 84% of sampled scenarios users preferred natural language explanations over a simple activity label, underscoring the need for explainable activity recognition systems. Finally, we show that explanations generated by some XAI methods can lead users to lose confidence in the accuracy of the underlying activity recognition model, while others lead users to gain confidence. Taking all studied factors into consideration, we make a recommendation regarding which existing XAI method leads to the best performance in the domain of smart home automation, and discuss a range of topics for future work in this area.
Requiring multiple demonstrations of a task plan presents a burden to end-users of robots. However, robustly executing tasks plans from a single end-user demonstration is an ongoing challenge in robotics. We address the problem of one-shot task execution, in which a robot must generalize a single demonstration or prototypical example of a task plan to a new execution environment. Our approach integrates task plans with domain knowledge to infer task plan constituents for new execution environments. Our experimental evaluations show that our knowledge representation makes more relevant generalizations that result in significantly higher success rates over tested baselines. We validated the approach on a physical platform, which resulted in the successful generalization of initial task plans to 38 of 50 execution environments with errors resulting from autonomous robot operation included.
In recent years, there has been a resurgence in methods that use distributed (neural) representations to represent and reason about semantic knowledge for robotics applications. However, while robots often observe previously unknown concepts, these representations typically assume that all concepts are known a priori, and incorporating new information requires all concepts to be learned afresh. Our work relaxes the static assumptions of these representations to tackle the incremental knowledge graph embedding problem by leveraging principles of a range of continual learning methods. Through an experimental evaluation with several knowledge graphs and embedding representations, we provide insights about trade-offs for practitioners to match a semantics-driven robotics application to a suitable continual knowledge graph embedding method.