Deep reinforcement learning (DRL) has seen remarkable success in the control of single robots. However, applying DRL to robot swarms presents significant challenges. A critical challenge is non-stationarity, which occurs when two or more robots update individual or shared policies concurrently, thereby engaging in an interdependent training process with no guarantees of convergence. Circumventing non-stationarity typically involves training the robots with global information about other agents' states and/or actions. In contrast, in this paper we explore how to remove the need for global information. We pose our problem as a Partially Observable Markov Decision Process, due to the absence of global knowledge on other agents. Using collective transport as a testbed scenario, we study two approaches to multi-agent training. In the first, the robots exchange no messages, and are trained to rely on implicit communication through push-and-pull on the object to transport. In the second approach, we introduce Global State Prediction (GSP), a network trained to form a belief over the swarm as a whole and predict its future states. We provide a comprehensive study over four well-known deep reinforcement learning algorithms in environments with obstacles, measuring performance as the successful transport of the object to the goal within a desired time-frame. Through an ablation study, we show that including GSP boosts performance and increases robustness when compared with methods that use global knowledge.
We present an approach to task scheduling in heterogeneous multi-robot systems. In our setting, the tasks to complete require diverse skills. We assume that each robot is multi-skilled, i.e., each robot offers a subset of the possible skills. This makes the formation of heterogeneous teams (\emph{coalitions}) a requirement for task completion. We present two centralized algorithms to schedule robots across tasks and to form suitable coalitions, assuming stochastic travel times across tasks. The coalitions are dynamic, in that the robots form and disband coalitions as the schedule is executed. The first algorithm we propose guarantees optimality, but its run-time is acceptable only for small problem instances. The second algorithm we propose can tackle large problems with short run-times, and is based on a heuristic approach that typically yields solutions whose cost is 1x-2x that of the optimal solution.
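To make the coalition requirement concrete, the following minimal sketch checks whether a set of multi-skilled robots jointly offers the skills a task requires. The robot names, skill labels, and both helper functions are hypothetical and are not taken from the paper. The brute-force search over subsets is only an illustration of the requirement, not either of the two scheduling algorithms, and it ignores travel times entirely.

```python
from itertools import combinations

# Hypothetical robots and the subset of skills each one offers.
ROBOT_SKILLS = {
    "r1": {"lift", "scan"},
    "r2": {"weld"},
    "r3": {"scan", "weld"},
}


def covers(coalition, required):
    """Return True if the union of the coalition's skills includes every required skill."""
    offered = set()
    for robot in coalition:
        offered |= ROBOT_SKILLS[robot]
    return required <= offered


def smallest_coalitions(required):
    """Brute-force search for all minimum-size coalitions that cover `required`."""
    robots = sorted(ROBOT_SKILLS)
    for size in range(1, len(robots) + 1):
        found = [c for c in combinations(robots, size) if covers(c, required)]
        if found:
            return found
    return []  # no subset of the robots offers all required skills
```

In this toy instance, a task requiring both "lift" and "weld" cannot be completed by any single robot, which is the situation that makes coalitions necessary.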
Collective perception is a foundational problem in swarm robotics, in which the swarm must reach consensus on a coherent representation of the environment. An important variant of collective perception casts it as a best-of-$n$ decision-making process, in which the swarm must identify the most likely representation out of a set of alternatives. Past work on this variant primarily focused on characterizing how different algorithms navigate the speed-vs-accuracy tradeoff in a scenario where the swarm must decide on the most frequent environmental feature. Crucially, past work on best-of-$n$ decision-making assumes the robot sensors to be perfect (noise- and fault-less), limiting the real-world applicability of these algorithms. In this paper, we derive from first principles an optimal, probabilistic framework for minimalistic swarm robots equipped with flawed sensors. Then, we validate our approach in a scenario where the swarm collectively decides the frequency of a certain environmental feature. We study the speed and accuracy of the decision-making process with respect to several parameters of interest. Our approach can provide timely and accurate frequency estimates even in the presence of severe sensory noise.
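As an illustration of why sensor noise matters for frequency estimation, the sketch below corrects a raw fraction of readings for known sensor error rates. It assumes a binary feature ("black" vs. "white" tiles), a single robot, and known sensor accuracies `b` and `w`; these names and the function itself are assumptions made for this example, and the estimator is a standard inversion of the observation model rather than the framework derived in the paper.

```python
def estimate_fill_ratio(n_black, n_total, b, w):
    """Correct the raw fraction of 'black' readings for known sensor error rates.

    b: probability the sensor reports black when the tile is black (assumed known).
    w: probability the sensor reports white when the tile is white (assumed known).
    """
    if n_total <= 0:
        raise ValueError("need at least one observation")
    if abs(b + w - 1.0) < 1e-12:
        raise ValueError("sensor is uninformative when b + w = 1")
    raw = n_black / n_total
    # Invert P(report black) = b*f + (1 - w)*(1 - f) to solve for f.
    f_hat = (raw + w - 1.0) / (b + w - 1.0)
    # The corrected value can leave [0, 1] with few samples, so clip it.
    return min(1.0, max(0.0, f_hat))
```

With a perfect sensor (`b = w = 1`) the function returns the raw fraction unchanged. With imperfect sensors, the raw fraction is biased and the correction removes that bias in expectation.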
Collective behaviors are typically hard to model. The scale of the swarm, the large number of interactions, and the richness and complexity of the behaviors are factors that make it difficult to distill a collective behavior into simple symbolic expressions. In this paper, we propose a novel approach to symbolic regression designed to facilitate such modeling. Using raw and post-processed data as an input, our approach produces viable symbolic expressions that closely model the target behavior. Our approach is composed of two phases. In the first, a graph neural network (GNN) is trained to extract an approximation of the target behavior. In the second phase, the GNN is used to produce data for a nested evolutionary algorithm called macro-micro evolution (MME). The macro layer of this algorithm selects candidate symbolic expressions, while the micro layer tunes their parameters. Experimental evaluation shows that our approach outperforms competing solutions for symbolic regression, making it possible to extract compact expressions for complex swarm behaviors.
The aim of this paper is to study how to apply deep reinforcement learning for the control of aggregates of minimalistic robots. We define aggregates as groups of robots with a physical connection that compels them to form a specified shape. In our case, the robots are pre-attached to an object that must be collectively transported to a known location. Minimalism, in our setting, stems from the barebone capabilities we assume: The robots can sense the target location and the immediate obstacles, but lack the means to communicate explicitly through, e.g., message-passing. In our setting, communication is implicit, i.e., mediated by aggregated push-and-pull on the object exerted by each robot. We analyze the ability to reach coordinated behavior of four well-known algorithms for deep reinforcement learning (DQN, DDQN, DDPG, and TD3). Our experiments include robot failures and different types of environmental obstacles. We compare the performance of the best control strategies found, highlighting strengths and weaknesses of each of the considered training algorithms.
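Two of the four algorithms named above, DQN and DDQN, differ mainly in how the bootstrap target is computed. The sketch below shows the two textbook target definitions on plain Python lists. The function names, the list-based Q-values, and the numbers in the usage note are illustrative assumptions and say nothing about the networks or hyperparameters used in the paper.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: bootstrap from the target network's own maximum."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]
```

For example, with reward 1.0, discount 0.5, online next-state values `[1.0, 3.0, 2.0]`, and target next-state values `[4.0, 2.0, 0.0]`, the DQN target is 3.0 while the Double DQN target is 2.0. Decoupling action selection from evaluation in this way is what reduces the overestimation of action values.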
The demining of landmines using drones is challenging; air-releasable payloads are typically non-intelligent (e.g., water balloons or explosives) and deploying them even at low altitudes (~6 m) is inherently inaccurate due to complex deployment trajectories and constrained visual awareness by the drone pilot. Soft robotics offers a unique approach for aerial demining, namely due to the robust, low-cost, and lightweight designs of soft robots. Instead of non-intelligent payloads, here, we propose the use of air-releasable soft robots for demining. We developed a full system consisting of an unmanned aerial vehicle retrofitted to a soft robot carrier including a custom-made deployment mechanism, and an air-releasable, lightweight (296 g), untethered soft hybrid robot with integrated electronics that incorporates a new type of vacuum-based flasher roller actuator system. We demonstrate a deployment cycle in which the drone drops the soft robotic hybrid from an altitude of 4.5 m, after which the robot approaches a dummy landmine. By deploying soft robots at points of interest, we can transition soft robotic technologies from the laboratory to real-world environments.
In this paper, we investigate how to design an effective interface for remote multi-human multi-robot interaction. While significant research exists on interfaces for individual human operators, little research exists for the multi-human case. Yet, this is a critical problem to solve in order to make achievable complex, large-scale missions in which direct human involvement is impossible or undesirable, and robot swarms act as semi-autonomous agents. This paper's contribution is twofold. The first contribution is an exploration of the design space of computer-based interfaces for multi-human multi-robot operations. In particular, we focus on information transparency and on the factors that affect inter-human communication in ideal conditions, i.e., without communication issues. Our second contribution concerns the same problem, but considering increasing degrees of information loss, defined as intermittent reception of data with noticeable gaps between individual receipts. We derived a set of design recommendations based on two user studies involving 48 participants.
How can multiple humans interact with multiple robots? The goal of our research is to create an effective interface that allows multiple operators to collaboratively control teams of robots in complex tasks. In this paper, we focus on a key aspect that affects our exploration of the design space of human-robot interfaces -- inter-human communication. More specifically, we study the impact of direct and indirect communication on several metrics, such as awareness, workload, trust, and interface usability. In our experiments, the participants can engage directly through verbal communication, or indirectly by representing their actions and intentions through our interface. We report the results of a user study based on a collective transport task involving 18 human subjects and 9 robots. Our study suggests that combining both direct and indirect communication is the best approach for effective multi-human / multi-robot interaction.
Transparency is a key factor in improving the performance of human-robot interaction. A transparent interface allows humans to be aware of the state of a robot and to assess the progress of the tasks at hand. When multi-robot systems are involved, transparency is an even greater challenge, due to the larger number of variables affecting the behavior of the robots as a whole. Significant effort has been devoted to studying transparency when single operators interact with multiple robots. However, studies on transparency that focus on multiple human operators interacting with multi-robot systems are limited. This paper aims to fill this gap by presenting a human-swarm interaction interface with graphical elements that can be enabled and disabled. Through this interface, we study which graphical elements contribute to transparency by comparing four "transparency modes": (i) no transparency (no operator receives information from the robots), (ii) central transparency (the operators receive information only relevant to their personal task), (iii) peripheral transparency (the operators share information on each other's tasks), and (iv) mixed transparency (both central and peripheral). We report the results in terms of awareness, trust, and workload of a user study involving 18 participants engaged in a complex multi-robot task.
In this paper, we propose an approach to the distributed storage and fusion of data for collective perception in resource-limited robot swarms. We demonstrate our approach in a distributed semantic classification scenario. We consider a team of mobile robots, in which each robot runs a pre-trained classifier of known accuracy to annotate objects in the environment. We provide two main contributions: (i) a decentralized, shared data structure for efficient storage and retrieval of the semantic annotations, specifically designed for low-resource mobile robots; and (ii) a voting-based, decentralized algorithm to reduce the variance of the calculated annotations in the presence of imperfect classification. We discuss theory and implementation of both contributions, and perform an extensive set of realistic simulated experiments to evaluate the performance of our approach.
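The intuition behind voting over imperfect classifiers can be sketched with a plain majority vote. The example below assumes binary labels and independent votes that are each correct with the same known probability `p`. Both functions are hypothetical helpers written for this illustration, and the centralized tally shown here is not the decentralized algorithm proposed in the paper.

```python
from collections import Counter
from math import comb


def majority_label(votes):
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def majority_accuracy(p, n):
    """Probability that a strict majority of n independent binary votes is correct.

    p: probability that a single vote is correct (the classifier's known accuracy).
    n: number of votes, assumed odd so that ties cannot occur.
    """
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))
```

Under these assumptions, three votes from classifiers that are each 80% accurate yield a correct majority with probability 0.896, which shows how aggregating annotations can make the fused result more reliable than any single classifier.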