This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust behaviour compared to state-of-the-art methods. Additionally, we show the exact relationship to classic methods such as PID control. Finally, we show that by learning a temporal parameter and model variances, our approach can deal with unmodelled dynamics, damps oscillations, and is robust against disturbances and poor initial parameters. The approach is validated on the `Franka Emika Panda' 7 DoF manipulator.
This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we consider how to pose the problem of approximating multiobjective planning solutions as a contextual multi-armed bandits problem, giving a principled motivation for how to select actions from the view of contextual regret. This leads us to the use of Contextual Zooming for action selection, yielding Zooming CHMCTS. We evaluate our algorithm using the Generalised Deep Sea Treasure environment, demonstrating that Zooming CHMCTS can achieve a sublinear contextual regret and scales better than CHVI on a given computational budget.
We propose novel techniques for task allocation and planning in multi-robot systems operating in uncertain environments. Task allocation is performed simultaneously with planning, which provides more detailed information about individual robot behaviour, but also exploits independence between tasks to do so efficiently. We use Markov decision processes to model robot behaviour and linear temporal logic to specify tasks and safety constraints. Building upon techniques and tools from formal verification, we show how to generate a sequence of multi-robot policies, iteratively refining them to reallocate tasks if individual robots fail, and providing probabilistic guarantees on the performance (and safe operation) of the team of robots under the resulting policy. We implement our approach and evaluate it on a benchmark multi-robot example.
Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots, and deploying these systems for long-term installations in security and care environments. Over four deployments, our robots have been operational for a combined duration of 104 days autonomously performing end-user defined tasks, covering 116km in the process. In this article we describe the approach we have used to enable long-term autonomous operation in everyday environments, and how our robots are able to use their long run times to improve their own performance.