Get our free extension to see links to code for papers anywhere online!Free extension: code links for papers anywhere!Free add-on: See code for papers anywhere!

Alexander Botros, Barry Gilhuly, Nils Wilde, Armin Sadeghi, Javier Alonso-Mora, Stephen L. Smith

We study the problem of deploying a fleet of mobile robots to service tasks that arrive stochastically over time and at random locations in an environment. This is known as the Dynamic Vehicle Routing Problem (DVRP) and requires robots to allocate incoming tasks among themselves and find an optimal sequence for each robot. State-of-the-art approaches only consider average wait times and focus on high-load scenarios where the arrival rate of tasks approaches the limit of what can be handled by the robots while keeping the queue of unserviced tasks bounded, i.e., stable. To ensure stability, these approaches repeatedly compute minimum distance tours over a set of newly arrived tasks. This paper is aimed at addressing the missing policies for moderate-load scenarios, where quality of service can be improved by prioritizing long-waiting tasks. We introduce a novel DVRP policy based on a cost function that takes the $p$-norm over accumulated wait times and show it guarantees stability even in high-load scenarios. We demonstrate that the proposed policy outperforms the state-of-the-art in both mean and $95^{th}$ percentile wait times in moderate-load scenarios through simulation experiments in the Euclidean plane as well as using real-world data for city scale service requests.

Via

Shamak Dutta, Nils Wilde, Pratap Tokekar, Stephen L. Smith

We study the sample placement and shortest tour problem for robots tasked with mapping environmental phenomena modeled as stationary random fields. The objective is to minimize the resources used (samples or tour length) while guaranteeing estimation accuracy. We give approximation algorithms for both problems in convex environments. These improve previously known results, both in terms of theoretical guarantees and in simulations. In addition, we disprove an existing claim in the literature on a lower bound for a solution to the sample placement problem.

Via

Yifan Cai, Abhinav Dahiya, Nils Wilde, Stephen L. Smith

In this paper, we consider the problem of allocating human operator assistance in a system with multiple autonomous robots. Each robot is required to complete independent missions, each defined as a sequence of tasks. While executing a task, a robot can either operate autonomously or be teleoperated by the human operator to complete the task at a faster rate. We show that the problem of creating a teleoperation schedule that minimizes makespan of the system is NP-Hard. We formulate our problem as a Mixed Integer Linear Program, which can be used to optimally solve small to moderate sized problem instances. We also develop an anytime algorithm that makes use of the problem structure to provide a fast and high-quality solution of the operator scheduling problem, even for larger problem instances. Our key insight is to identify blocking tasks in greedily-created schedules and iteratively remove those blocks to improve the quality of the solution. Through numerical simulations, we demonstrate the benefits of the proposed algorithm as an efficient and scalable approach that outperforms other greedy methods.

Via

Alexander Botros, Armin Sadeghi, Nils Wilde, Javier Alonso-Mora, Stephen L. Smith

Many problems in robotics seek to simultaneously optimize several competing objectives under constraints. A conventional approach to solving such multi-objective optimization problems is to create a single cost function comprised of the weighted sum of the individual objectives. Solutions to this scalarized optimization problem are Pareto optimal solutions to the original multi-objective problem. However, finding an accurate representation of a Pareto front remains an important challenge. Using uniformly spaced weight vectors is often inefficient and does not provide error bounds. Thus, we address the problem of computing a finite set of weight vectors such that for any other weight vector, there exists an element in the set whose error compared to optimal is minimized. To this end, we prove fundamental properties of the optimal cost as a function of the weight vector, including its continuity and concavity. Using these, we propose an algorithm that greedily adds the weight vector least-represented by the current set, and provide bounds on the error. Finally, we illustrate that the proposed approach significantly outperforms uniformly distributed weights for different robot planning problems with varying numbers of objective functions.

Via

Shamak Dutta, Nils Wilde, Stephen L. Smith

In this paper, we consider a subset selection problem in a spatial field where we seek to find a set of k locations whose observations provide the best estimate of the field value at a finite set of prediction locations. The measurements can be taken at any location in the continuous field, and the covariance between the field values at different points is given by the widely used squared exponential covariance function. One approach for observation selection is to perform a grid discretization of the space and obtain an approximate solution using the greedy algorithm. The solution quality improves with a finer grid resolution but at the cost of increased computation. We propose a method to reduce the computational complexity, or conversely to increase solution quality, of the greedy algorithm by considering a search space consisting only of prediction locations and centroids of cliques formed by the prediction locations. We demonstrate the effectiveness of our proposed approach in simulation, both in terms of solution quality and runtime.

Via

Nils Wilde, Armin Sadeghi, Stephen L. Smith

In this paper, we study the well-known team orienteering problem where a fleet of robots collects rewards by visiting locations. Usually, the rewards are assumed to be known to the robots; however, in applications such as environmental monitoring or scene reconstruction, the rewards are often subjective and specifying them is challenging. We propose a framework to learn the unknown preferences of the user by presenting alternative solutions to them, and the user provides a ranking on the proposed alternative solutions. We consider the two cases for the user: 1) a deterministic user which provides the optimal ranking for the alternative solutions, and 2) a noisy user which provides the optimal ranking according to an unknown probability distribution. For the deterministic user we propose a framework to minimize a bound on the maximum deviation from the optimal solution, namely regret. We adapt the approach to capture the noisy user and minimize the expected regret. Finally, we demonstrate the importance of learning user preferences and the performance of the proposed methods in an extensive set of experimental results using real world datasets for environmental monitoring problems.

Via

Nils Wilde, Erdem Bıyık, Dorsa Sadigh, Stephen L. Smith

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced user's preferences. A common framework is to iteratively query the user about which of two presented robot trajectories they prefer. While this minimizes the users effort, a strict choice does not yield any information on how much one trajectory is preferred. We propose scale feedback, where the user utilizes a slider to give more nuanced information. We introduce a probabilistic model on how users would provide feedback and derive a learning framework for the robot. We demonstrate the performance benefit of slider feedback in simulations, and validate our approach in two user studies suggesting that scale feedback enables more effective learning in practice.

Via

Nils Wilde, Dana Kulic, Stephen L. Smith

We study active preference learning as a framework for intuitively specifying the behaviour of autonomous robots. In active preference learning, a user chooses the preferred behaviour from a set of alternatives, from which the robot learns the user's preferences, modeled as a parameterized cost function. Previous approaches present users with alternatives that minimize the uncertainty over the parameters of the cost function. However, different parameters might lead to the same optimal behaviour; as a consequence the solution space is more structured than the parameter space. We exploit this by proposing a query selection that greedily reduces the maximum error ratio over the solution space. In simulations we demonstrate that the proposed approach outperforms other state of the art techniques in both learning efficiency and ease of queries for the user. Finally, we show that evaluating the learning based on the similarities of solutions instead of the similarities of weights allows for better predictions for different scenarios.

Via

Nils Wilde, Alexandru Blidaru, Stephen L. Smith, Dana Kulić

An important challenge in human robot interaction (HRI) is enabling non-expert users to specify complex tasks for autonomous robots. Recently, active preference learning has been applied in HRI to interactively shape a robot's behaviour. We study a framework where users specify constraints on allowable robot movements on a graphical interface, yielding a robot task specification. However, users may not be able to accurately assess the impact of such constraints on the performance of a robot. Thus, we revise the specification by iteratively presenting users with alternative solutions where some constraints might be violated, and learn about the importance of the constraints from the users' choices between these alternatives. We demonstrate our framework in a user study with a material transport task in an industrial facility. We show that nearly all users accept alternative solutions and thus obtain a revised specification through the learning process, and that the revision leads to a substantial improvement in robot performance. Further, the learning process reduces the variances between the specifications from different users and thus makes the specifications more similar. As a result, the users whose initial specifications had the largest impact on performance benefit the most from the interactive learning.

Via

Nils Wilde, Dana Kulic, Stephen L. Smith

Specifying complex task behaviours while ensuring good robot performance may be difficult for untrained users. We study a framework for users to specify rules for acceptable behaviour in a shared environment such as industrial facilities. As non-expert users might have little intuition about how their specification impacts the robot's performance, we design a learning system that interacts with the user to find an optimal solution. Using active preference learning, we iteratively show alternative paths that the robot could take on an interface. From the user feedback ranking the alternatives, we learn about the weights that users place on each part of their specification. We extend the user model from our previous work to a discrete Bayesian learning model and introduce a greedy algorithm for proposing alternative that operates on the notion of equivalence regions of user weights. We prove that with this algorithm the revision active learning process converges on the user-optimal path. In simulations on realistic industrial environments, we demonstrate the convergence and robustness of our approach.

Via