Communication robots have the potential to contribute to effective human-XAI interaction as an interface that goes beyond textual or graphical explanations. One of their strengths is that they can use physical and vocal expressions to add detailed nuances to explanations. However, it is not clear how a robot should apply such expressions, and in particular, how to develop a strategy for adaptively using them depending on the task and user in dynamic interactions. To address this question, this paper proposes DynEmph, a method for a communication robot to decide where to emphasize XAI-generated explanations with physical expressions. It predicts the effect of emphasizing certain points on a user and aims to minimize the expected difference between predicted user decisions and AI-suggested ones. DynEmph decides where to emphasize in a data-driven manner, relieving engineers of the need to manually design a strategy. We further conducted experiments to investigate how emphasis selection strategies affect the performance of user decisions. The results suggest that, while a naive strategy (emphasizing explanations for the AI's most probable class) does not necessarily perform better, DynEmph effectively guides users to better decisions when the performance of the AI's suggestions is high.
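As a rough sketch of the decision rule described above (assuming a learned user model with a hypothetical predict_decision_dist interface and a simple 1 - P(suggested) loss, neither of which is specified in the abstract), the emphasis point could be chosen as follows:

```python
def select_emphasis(candidate_points, user_model, ai_suggestion, features):
    """Pick the explanation point whose emphasis is predicted to pull the
    user's decision closest to the AI-suggested one (illustrative sketch)."""
    best_point, best_loss = None, float("inf")
    for point in candidate_points:
        # user_model is assumed to return a predicted probability
        # distribution over the user's possible decisions
        predicted = user_model.predict_decision_dist(features, emphasized=point)
        # expected mismatch with the AI suggestion: 1 - P(suggested decision)
        loss = 1.0 - predicted[ai_suggestion]
        if loss < best_loss:
            best_point, best_loss = point, loss
    return best_point
```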
This paper addresses the problem of how to select explanations for XAI (Explainable AI)-based Intelligent Decision Support Systems (IDSSs). IDSSs have shown promise in improving user decisions through XAI-generated explanations along with AI predictions, and as the development of XAI has made various explanations available, we believe IDSSs can be greatly improved if they can strategically select the explanations that guide users to better decisions. This paper proposes X-Selector, a method for dynamically selecting explanations. X-Selector aims to guide users to better decisions by predicting the impact of different combinations of explanations on user decisions. We compared X-Selector's performance with that of two naive strategies (presenting all possible explanations, and presenting explanations only for the most likely prediction) and two baselines (no explanation and no AI support). The results suggest X-Selector's potential to guide users to recommended decisions and improve performance when AI accuracy is high, as well as a challenge when it is low.
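A minimal sketch of this selection step, assuming a hypothetical user model exposing predict_follow_prob and a brute-force search over small explanation combinations (the actual method's search and prediction model may differ):

```python
from itertools import combinations

def select_explanations(explanations, user_model, recommended, max_k=2):
    """Search over combinations of explanations and keep the one predicted
    to make the user most likely to take the recommended decision."""
    best_combo, best_score = (), 0.0
    for k in range(1, max_k + 1):
        for combo in combinations(explanations, k):
            # predicted probability that the user follows the recommendation
            score = user_model.predict_follow_prob(combo, recommended)
            if score > best_score:
                best_combo, best_score = combo, score
    return best_combo
```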
For effective collaboration between humans and intelligent agents that employ machine learning for decision-making, humans must understand what agents can and cannot do to avoid over/under-reliance. A solution to this problem is adjusting human reliance through communication using reliance calibration cues (RCCs) to help humans assess agents' capabilities. Previous studies typically attempted to calibrate reliance by continuously presenting RCCs, but when an agent should provide RCCs remains an open question. To answer this, we propose Pred-RC, a method for selectively providing RCCs. Pred-RC uses a cognitive reliance model to predict whether a human will assign a task to an agent. By comparing the predictions for the cases with and without an RCC, Pred-RC evaluates the influence of the RCC on human reliance. We tested Pred-RC in a human-AI collaboration task and found that it can successfully calibrate human reliance with a reduced number of RCCs.
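A simplified sketch of the with/without comparison, assuming a reliance model with a hypothetical predict_assign_prob interface and using the agent's success probability as the calibration target (the target and threshold logic are illustrative assumptions):

```python
def should_present_rcc(reliance_model, task, rcc, agent_success_prob):
    """Present the RCC only if it brings predicted reliance closer to the
    agent's actual capability (sketch of the with/without comparison)."""
    p_without = reliance_model.predict_assign_prob(task, rcc=None)
    p_with = reliance_model.predict_assign_prob(task, rcc=rcc)
    gap_without = abs(p_without - agent_success_prob)
    gap_with = abs(p_with - agent_success_prob)
    return gap_with < gap_without
```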
This study investigated how wait time influences trust in and reliance on a robot. Experiment 1 was conducted online, manipulating the wait time for the task partner's action from 1 to 20 seconds and the anthropomorphism of the partner. Anthropomorphism influenced trust in the partner but did not influence reliance on it, whereas wait time negatively influenced both trust in and reliance on the partner. Moreover, trust was confirmed to mediate the effect of wait time on reliance. Experiment 2 was conducted to confirm the effects of wait time on trust and reliance in a face-to-face human-robot situation, and the same effects of wait time found in Experiment 1 were confirmed. This study revealed that wait time is a strong and controllable factor that influences trust in and reliance on a robot.
We present an AI-assisted search tool, the "Design Concept Exploration Graph" ("D-Graph"). It assists automotive designers in creating an original design-concept phrase, that is, a combination of two adjectives that conveys product aesthetics. D-Graph retrieves adjectives from the ConceptNet knowledge graph as nodes and visualizes them in a dynamically scalable 3D graph as users explore words. The retrieval algorithm helps users find unique words by ruling out overused words, on the basis of word frequency in a large text corpus, and word pairs that are too similar to each other, on the basis of cosine similarity between ConceptNet Numberbatch word embeddings. In our experiment, participants from the automotive design field used both the proposed D-Graph and a baseline tool in design-concept-phrase creation tasks; the results suggested a positive, though not significant, difference in participants' self-evaluations of the phrases they created. Experts' evaluations of the phrases did not show significant differences, but negative correlations between the cosine similarity of the two words in a design-concept phrase and the experts' evaluations were significant. Our qualitative analysis suggested directions for further development of the tool to help users adhere to the strategy of creating compound phrases supported by computational-linguistic principles.
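The two filtering criteria could be sketched roughly as follows; the thresholds, the freq table, and the embeddings lookup are illustrative assumptions, not the tool's actual parameters:

```python
import numpy as np

def filter_candidates(candidates, word1, freq, embeddings,
                      max_freq=1e-4, max_sim=0.4):
    """Keep candidate adjectives that are neither overused nor too similar
    to word1. freq: word -> relative corpus frequency;
    embeddings: word -> Numberbatch-style vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    kept = []
    for w in candidates:
        if freq.get(w, 0.0) > max_freq:                          # overused word
            continue
        if cosine(embeddings[w], embeddings[word1]) > max_sim:   # too similar
            continue
        kept.append(w)
    return kept
```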
Human-agent teaming, in which humans and autonomous agents collaborate to accomplish a task, is a typical setting in human-AI collaboration. For effective collaboration, humans want an effective plan, but in realistic situations they may have difficulty calculating the best plan due to cognitive limitations. In this case, guidance from an agent with abundant computational resources may be useful. However, if an agent guides human behavior explicitly, the human may feel that they have lost autonomy and are being controlled by the agent. We therefore investigated implicit guidance offered by means of an agent's behavior. With this type of guidance, the agent acts in a way that makes it easy for the human to find an effective plan for the collaborative task, and the human can then improve the plan. Since the human improves the plan voluntarily, they maintain autonomy. We modeled a collaborative agent with implicit guidance by integrating the Bayesian Theory of Mind into existing collaborative-planning algorithms and demonstrated through a behavioral experiment that implicit guidance is effective for enabling humans to balance improving their plans with retaining autonomy.
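One ingredient of such a model, the Bayesian Theory-of-Mind inference over which plan the human is following, might be sketched as follows; the plan set, prior, and likelihood function are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def infer_human_plan(plans, prior, likelihood, observed_actions):
    """Bayesian Theory-of-Mind-style update: infer which candidate plan the
    human is following from their observed actions.
    prior: plan -> prior probability; likelihood(action, plan) -> probability."""
    posterior = np.array([
        prior[p] * np.prod([likelihood(a, p) for a in observed_actions])
        for p in plans
    ])
    return posterior / posterior.sum()  # normalized posterior over plans
```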
As communities begin to seek human presence in alternative ways, there has been tremendous research and development in advancing telepresence robots. People tend to feel closer to and more comfortable with telepresence robots when they sense a human presence in them. In general, people perceive a sense of agency from a robot's face, yet some telepresence robots without arm and body motions still convey a sense of human presence. It is therefore important to identify how a telepresence robot's human face and slight face and arm motions affect people's sense of presence and agency. We carried out a web-based experiment to determine which prototype results in comfortable human interaction with the robot. The experiment featured videos of a telepresence robot (n = 128; 2 x 2 between-participants design; robot face factor: video-conference face vs. robot-like face; arm motion factor: moving vs. static) to investigate which factors significantly affect the sense of human presence and agency with the robot. We used two telepresence robots: an affordable robot platform and a version modified to enhance human interaction. The findings suggest that participants felt an agency closer to human-likeness when the robot's face was replaced with a human face and the robot did not move, while the robot's motion invoked a feeling of human presence whether the face was human or robot-like.
Reinforcement learning, which acquires a policy that maximizes long-term rewards, has been actively studied. Unfortunately, it is too slow and difficult to use in practical situations because the state-action space becomes huge in real environments. Many studies have incorporated human knowledge into reinforcement learning. Human knowledge in the form of trajectories is often used, but providing it requires a human to control the AI agent, which can be difficult. Knowledge of subgoals may lessen this requirement because humans need only consider a few representative states on an optimal trajectory. The essential factor for learning efficiency is rewards, and potential-based reward shaping is a basic method for enriching them. However, it is often difficult to incorporate subgoals into potential-based reward shaping to accelerate learning because the appropriate potentials are not intuitive for humans. We extend potential-based reward shaping and propose subgoal-based reward shaping, which makes it easier for human trainers to share their knowledge of subgoals. To evaluate the method, we obtained subgoal series from participants and conducted experiments in three domains: four-rooms (discrete states and actions), pinball (continuous states and discrete actions), and picking (both continuous). We compared our method with a baseline reinforcement learning algorithm and other subgoal-based methods, including random subgoals and naive subgoal-based reward shaping. We found that our reward shaping outperformed all other methods in learning efficiency.
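A simplified sketch of the idea, assuming an ordered subgoal series and the standard potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s); the potential derived from subgoal progress is an illustrative choice, and the paper's exact form may differ:

```python
def make_subgoal_potential(subgoals, is_achieved, value_per_subgoal=1.0):
    """Build a potential function that grows with the number of subgoals
    achieved so far (subgoals are assumed to be ordered)."""
    def phi(state):
        achieved = 0
        for sg in subgoals:
            if is_achieved(state, sg):
                achieved += 1
            else:
                break
        return value_per_subgoal * achieved
    return phi

def shaped_reward(reward, state, next_state, phi, gamma=0.99):
    """Potential-based shaping: add F(s, s') = gamma * phi(s') - phi(s)."""
    return reward + gamma * phi(next_state) - phi(state)
```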
Social navigation has been gaining attention with the growth of machine intelligence. Since reinforcement learning can select an action at a low computational cost in the prediction phase, it has been applied to social navigation tasks. However, reinforcement learning requires an enormous number of iterations to acquire a behavior policy in the learning phase, which negatively affects the learning of robot behaviors in the real world. In particular, social navigation involves humans, who are unpredictable moving obstacles in the environment. We propose a reward shaping method with subgoals to accelerate learning. Its main part is an aggregation method that uses subgoals to shape a reinforcement learning algorithm. We performed a learning experiment with a social navigation task in which a robot avoided collisions and then reached its goal. The experimental results show that our method improved learning efficiency over a baseline algorithm in the task.
Reinforcement learning, which acquires a policy that maximizes long-term rewards, has been actively studied. Unfortunately, it is too slow and difficult to use in practical situations because the state-action space becomes huge in real environments. The essential factor for learning efficiency is rewards, and potential-based reward shaping is a basic method for enriching them. This method requires defining a domain-specific real-valued function called a potential function, which is often difficult to represent directly. SARSA-RS learns the potential function instead, but it can only be applied to simple environments: its bottleneck is the aggregation of states into abstract states, since it is almost impossible for designers to build an aggregation function covering all states. We propose a trajectory aggregation method that uses subgoal series. It dynamically aggregates the states in an episode during trial and error using only the subgoal series and a subgoal identification function, which minimizes designer effort and makes the method applicable to environments with high-dimensional observations. We obtained subgoal series from participants and conducted experiments in three domains: four-rooms (discrete states and actions), pinball (continuous states and discrete actions), and picking (both continuous). We compared our method with a baseline reinforcement learning algorithm and other subgoal-based methods, including random subgoals and naive subgoal-based reward shaping. As a result, our reward shaping outperformed all other methods in learning efficiency.
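A rough sketch of such an aggregator, assuming a user-supplied subgoal identification function; here the abstract state is simply the number of subgoals achieved so far in the episode, which is one plausible reading of the aggregation rather than the paper's exact definition:

```python
class SubgoalAggregator:
    """Trajectory aggregation sketch: map raw states to abstract states by
    tracking progress through an ordered subgoal series within an episode.
    `identify(state, subgoal)` is an assumed user-supplied callable."""

    def __init__(self, subgoal_series, identify):
        self.subgoal_series = subgoal_series
        self.identify = identify
        self.progress = 0              # index of the next subgoal to achieve

    def reset(self):
        """Call at the start of each episode."""
        self.progress = 0

    def abstract_state(self, state):
        # advance progress if the current state matches the next subgoal
        if (self.progress < len(self.subgoal_series)
                and self.identify(state, self.subgoal_series[self.progress])):
            self.progress += 1
        return self.progress           # abstract state = subgoals achieved so far
```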