Jan Peters

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Jun 28, 2024

Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai(+12 more)

Abstract:We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback. Extensive experiments validate the framework, showcasing robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate the adoption of our framework and support the reproduction of our results, we have made our code open-source. You can access it at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.

* This document contains 26 pages and 13 figures

Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning

May 25, 2024

Théo Vincent, Fabian Wahren, Jan Peters, Boris Belousov, Carlo D'Eramo

Abstract:Deep Reinforcement Learning (RL) is well known for being highly sensitive to hyperparameters, requiring substantial effort from practitioners to optimize them for the problem at hand. In recent years, the field of automated Reinforcement Learning (AutoRL) has grown in popularity by trying to address this issue. However, these approaches typically hinge on additional samples to select well-performing hyperparameters, hindering sample-efficiency and practicality in RL. Furthermore, most AutoRL methods are heavily based on already existing AutoML methods, which were originally developed neglecting the additional challenges inherent to RL due to its non-stationarities. In this work, we propose a new approach for AutoRL, called Adaptive $Q$-Network (AdaQN), that is tailored to RL to take into account the non-stationarity of the optimization procedure without requiring additional samples. AdaQN learns several $Q$-functions, each one trained with different hyperparameters, which are updated online using the $Q$-function with the smallest approximation error as a shared target. Our selection scheme simultaneously handles different hyperparameters while coping with the non-stationarity induced by the RL optimization procedure and being orthogonal to any critic-based RL algorithm. We demonstrate that AdaQN is theoretically sound and empirically validate it in MuJoCo control problems, showing benefits in sample-efficiency, overall performance, training stability, and robustness to stochasticity.

* Preprint
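
The shared-target idea in the abstract above can be illustrated with a small tabular sketch. This is not the authors' implementation: the function name `adaqn_tabular`, the use of learning rates as the only varied hyperparameter, the accumulated squared TD error as the "approximation error", and the fixed `target_period` are all assumptions made for illustration.

```python
def adaqn_tabular(transitions, n_states, n_actions, learning_rates,
                  gamma=0.9, target_period=50):
    """Hypothetical tabular sketch of shared-target selection.

    transitions: iterable of (state, action, reward, next_state, done).
    One Q-table is kept per learning rate; all of them regress onto
    the same shared target table.
    """
    online = [[[0.0] * n_actions for _ in range(n_states)]
              for _ in learning_rates]
    target = [[0.0] * n_actions for _ in range(n_states)]
    errors = [0.0] * len(learning_rates)  # accumulated squared TD errors
    chosen = []                           # index selected at each target update

    for step, (s, a, r, s_next, done) in enumerate(transitions, start=1):
        # Every online table uses the same bootstrapped target value.
        y = r if done else r + gamma * max(target[s_next])
        for k, lr in enumerate(learning_rates):
            td = y - online[k][s][a]
            errors[k] += td * td
            online[k][s][a] += lr * td
        if step % target_period == 0:
            # Assumed selection rule: the table with the smallest
            # accumulated error becomes the next shared target.
            best = min(range(len(errors)), key=errors.__getitem__)
            target = [row[:] for row in online[best]]
            chosen.append(best)
            errors = [0.0] * len(learning_rates)
    return target, chosen
```

In this sketch no extra environment samples are used for selection: the same transitions that train the tables also score them, which mirrors the abstract's claim only loosely.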

Learning Tactile Insertion in the Real World

May 01, 2024

Daniel Palenicek, Theo Gruner, Tim Schneider, Alina Böhm, Janis Lenz, Inga Pfenning, Eric Krämer, Jan Peters

Abstract:Humans have exceptional tactile sensing capabilities, which they can leverage to solve challenging, partially observable tasks that cannot be solved from visual observation alone. Research in tactile sensing attempts to unlock this new input modality for robots. Lately, these sensors have become cheaper and, thus, widely available. At the same time, the question of how to integrate them into control loops is still an active area of research, with central challenges being partial observability and the contact-rich nature of manipulation tasks. In this study, we propose to use Reinforcement Learning to learn an end-to-end policy, mapping directly from tactile sensor readings to actions. Specifically, we use Dreamer-v3 on a challenging, partially observable robotic insertion task with a Franka Research 3, both in simulation and on a real system. For the real setup, we built a robotic platform capable of resetting itself fully autonomously, allowing for extensive training runs without human supervision. Our preliminary results indicate that Dreamer is capable of utilizing tactile inputs to solve robotic manipulation tasks in simulation and reality. Furthermore, we find that providing the robot with tactile feedback generally improves task performance, though, in our setup, we do not yet include other sensing modalities. In the future, we plan to utilize our platform to evaluate a wide range of other Reinforcement Learning algorithms on tactile tasks.

Integrating Visuo-tactile Sensing with Haptic Feedback for Teleoperated Robot Manipulation

Apr 30, 2024

Noah Becker, Erik Gattung, Kay Hansel, Tim Schneider, Yaonan Zhu, Yasuhisa Hasegawa, Jan Peters

Abstract:Telerobotics enables humans to overcome spatial constraints and allows them to physically interact with the environment in remote locations. However, the sensory feedback provided by the system to the operator is often purely visual, limiting the operator's dexterity in manipulation tasks. In this work, we address this issue by equipping the robot's end-effector with high-resolution visuotactile GelSight sensors. Using low-cost MANUS-Gloves, we provide the operator with haptic feedback about forces acting at the points of contact in the form of vibration signals. We propose two different methods for estimating these forces: one based on estimating the movement of markers on the sensor surface and one based on deep learning. Additionally, we integrate our system into a virtual-reality teleoperation pipeline in which a human operator controls both arms of a Tiago robot while receiving visual and haptic feedback. We believe that integrating haptic feedback is a crucial step for dexterous manipulation in teleoperated robotic systems.

Clustering of Motion Trajectories by a Distance Measure Based on Semantic Features

Apr 26, 2024

Christoph Zelch, Jan Peters, Oskar von Stryk

Abstract:Clustering of motion trajectories is highly relevant for human-robot interactions as it allows the anticipation of human motions, fast reaction to those, as well as the recognition of explicit gestures. Further, it allows automated analysis of recorded motion data. Many clustering algorithms for trajectories build upon distance metrics that are based on pointwise Euclidean distances. However, our work indicates that focusing on salient characteristics is often sufficient. We present a novel distance measure for motion plans consisting of state and control trajectories that is based on a compressed representation built from their main features. This approach allows a flexible choice of feature classes relevant to the respective task. The distance measure is used in agglomerative hierarchical clustering. We compare our method with the widely used dynamic time warping algorithm on test sets of motion plans for the Furuta pendulum and the Manutec robot arm and on real-world data from a human motion dataset. The proposed method demonstrates slight advantages in clustering and strong advantages in runtime, especially for long trajectories.

* Published in: 2023 IEEE-RAS 22nd International Conference on Humanoid Robots (Humanoids), Austin, TX, USA, 2023. Code available at: https://github.com/cztuda/semantic-feature-clustering

Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications

Apr 13, 2024

Puze Liu, Haitham Bou-Ammar, Jan Peters, Davide Tateo

Abstract:Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. However, most existing approaches are trained in well-tuned simulators and subsequently deployed on real robots without online fine-tuning. In this setting, the simulation's realism seriously impacts the deployment's success rate. Instead, learning with real-world interaction data offers a promising alternative: it not only eliminates the need for a fine-tuned simulator but also applies to a broader range of tasks where accurate modeling is unfeasible. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and non-linear, making safety challenging to guarantee in learning systems. In this paper, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the Constraint Manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world Robot Air Hockey task, showing that our method can handle high-dimensional tasks with complex constraints. Videos of the real robot experiments are available on the project website (https://puzeliu.github.io/TRO-ATACOM).

* 19 pages; submitted to IEEE Transactions on Robotics
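
The tangent-space idea mentioned in the abstract can be sketched for a single equality constraint. The example below only shows the projection of an arbitrary action onto the tangent space of a constraint; it does not reproduce the paper's full method. The unit-circle constraint and the function names `circle_constraint_jacobian` and `project_to_tangent` are hypothetical choices for illustration.

```python
def circle_constraint_jacobian(q):
    """Jacobian of the assumed constraint c(q) = q0^2 + q1^2 - 1."""
    return [2.0 * q[0], 2.0 * q[1]]


def project_to_tangent(jacobian_row, action):
    """Remove the component of `action` that would change the constraint.

    For a single constraint with Jacobian row J, the tangent-space
    projection is a - J^T (J a) / (J J^T).
    """
    jj = sum(j * j for j in jacobian_row)
    if jj == 0.0:
        # Degenerate Jacobian: nothing to project out.
        return list(action)
    scale = sum(j * a for j, a in zip(jacobian_row, action)) / jj
    return [a - scale * j for a, j in zip(action, jacobian_row)]


# On the circle at q = (1, 0), only motion along the second axis is tangent.
q = (1.0, 0.0)
safe_velocity = project_to_tangent(circle_constraint_jacobian(q), [3.0, 4.0])
print(safe_velocity)  # [0.0, 4.0]
```

Because the projected velocity is orthogonal to the constraint gradient, it leaves the constraint value unchanged to first order, whatever action the learning agent samples.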

What Matters for Active Texture Recognition With Vision-Based Tactile Sensors

Mar 20, 2024

Alina Böhm, Tim Schneider, Boris Belousov, Alap Kshirsagar, Lisa Lin, Katja Doerschner, Knut Drewing, Constantin A. Rothkopf, Jan Peters

Abstract:This paper explores active sensing strategies that employ vision-based tactile sensors for robotic perception and classification of fabric textures. We formalize the active sampling problem in the context of tactile fabric recognition and provide an implementation of information-theoretic exploration strategies based on minimizing predictive entropy and variance of probabilistic models. Through ablation studies and human experiments, we investigate which components are crucial for quick and reliable texture recognition. Along with the active sampling strategies, we evaluate neural network architectures, representations of uncertainty, influence of data augmentation, and dataset variability. By evaluating our method on a previously published Active Clothing Perception Dataset and on a real robotic system, we establish that the choice of the active exploration strategy has only a minor influence on the recognition accuracy, whereas data augmentation and dropout rate play a significantly larger role. In a comparison study, while humans achieve 66.9% recognition accuracy, our best approach reaches 90.0% in under 5 touches, highlighting that vision-based tactile sensors are highly effective for fabric texture recognition.

* 7 pages, 9 figures, accepted at 2024 IEEE International Conference on Robotics and Automation (ICRA)
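
An entropy-based exploration rule of the kind the abstract mentions can be sketched as follows. This is a simplified illustration rather than the paper's implementation: the function names, the touch labels, and the assumption that a predicted class distribution is available for each candidate touch are all hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def select_touch(predicted):
    """Pick the candidate touch whose predicted class distribution
    has the lowest entropy.

    predicted: dict mapping a touch label to the class distribution the
    model is assumed to predict after executing that touch.
    """
    return min(predicted, key=lambda name: entropy(predicted[name]))


# Hypothetical candidates over three fabric classes.
candidates = {
    "press": [0.4, 0.3, 0.3],
    "stroke": [0.9, 0.05, 0.05],
}
print(select_touch(candidates))  # stroke
```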

Iterated $Q$-Network: Beyond the One-Step Bellman Operator

Mar 04, 2024

Théo Vincent, Daniel Palenicek, Boris Belousov, Jan Peters, Carlo D'Eramo

Abstract:Value-based Reinforcement Learning (RL) methods rely on the application of the Bellman operator, which needs to be approximated from samples. Most approaches consist of an iterative scheme alternating the application of the Bellman operator and a subsequent projection step onto a considered function space. However, we observe that these algorithms can be improved by considering multiple iterations of the Bellman operator at once. Thus, we introduce iterated $Q$-Networks (iQN), a novel approach that learns a sequence of $Q$-function approximations where each $Q$-function serves as the target for the next one in a chain of consecutive Bellman iterations. We demonstrate that iQN is theoretically sound and show how it can be seamlessly used in value-based and actor-critic methods. We empirically demonstrate its advantages on Atari 2600 games and in continuous-control MuJoCo environments.

* Preprint
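
The chained-target structure described in the abstract can be illustrated with a tabular sketch in which each table regresses onto a Bellman backup of its predecessor. This is an assumption-laden illustration, not the authors' code: the name `iqn_tabular`, the tabular setting, and the periodic window shift controlled by `shift_period` are hypothetical.

```python
def iqn_tabular(transitions, n_states, n_actions, n_iterations=3,
                gamma=0.9, lr=0.5, shift_period=20):
    """Hypothetical tabular sketch of a chain of Q-functions.

    chain[0] is a frozen head; chain[k] is trained towards the
    Bellman backup of chain[k - 1] for k = 1..n_iterations.
    """
    chain = [[[0.0] * n_actions for _ in range(n_states)]
             for _ in range(n_iterations + 1)]
    for step, (s, a, r, s_next, done) in enumerate(transitions, start=1):
        for k in range(1, n_iterations + 1):
            # The previous table in the chain provides the target.
            y = r if done else r + gamma * max(chain[k - 1][s_next])
            chain[k][s][a] += lr * (y - chain[k][s][a])
        if step % shift_period == 0:
            # Assumed schedule: slide the window one Bellman iteration
            # forward, duplicating the last table as the new tail.
            chain = chain[1:] + [[row[:] for row in chain[-1]]]
    return chain
```

With a single transition stream, all tables in the chain are updated at every step, so several consecutive Bellman iterations are learned at once instead of one at a time.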

Information-Theoretic Safe Bayesian Optimization

Feb 23, 2024

Alessandro G. Bottero, Carlos E. Luis, Julia Vinogradska, Felix Berkenkamp, Jan Peters

Abstract:We consider a sequential decision making task, where the goal is to optimize an unknown function without evaluating parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown functions and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. The combination of this exploration criterion with a well-known Bayesian optimization acquisition function yields a novel safe Bayesian optimization selection criterion. Our approach is naturally applicable to continuous domains and does not require additional explicit hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we learn about the value of the safe optimum up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.

* arXiv admin note: text overlap with arXiv:2212.04914
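
The "common approach" the abstract refers to, allowing evaluations only where the constraint is safe with high probability, can be sketched with a lower confidence bound on a discretized domain. This illustrates the baseline idea only, not the information-theoretic criterion the paper proposes. The function name, the confidence multiplier `beta`, the safety `threshold`, and the posterior values are assumptions; a real setup would obtain the mean and standard deviation from a fitted Gaussian process.

```python
def safe_candidates(points, mean, std, beta=2.0, threshold=0.0):
    """Return the points whose constraint lower confidence bound,
    mean - beta * std, stays at or above the safety threshold."""
    return [x for x, m, s in zip(points, mean, std)
            if m - beta * s >= threshold]


# Hypothetical posterior over three candidate parameters.
points = [0, 1, 2]
posterior_mean = [1.0, 0.5, -1.0]
posterior_std = [0.1, 0.5, 0.1]
print(safe_candidates(points, posterior_mean, posterior_std))  # [0]
```

Note that `beta` here plays the role of an explicit hyperparameter of the kind the abstract says the proposed method avoids.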

Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers

Feb 22, 2024

Yasemin Göksu, Antonio De Almeida Correia, Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Jan Peters, Georgia Chalvatzaki

Abstract:Bimanual handovers are crucial for transferring large, deformable or delicate objects. This paper proposes a framework for generating kinematically constrained human-like bimanual robot motions to ensure seamless and natural robot-to-human object handovers. We use a Hidden Semi-Markov Model (HSMM) to reactively generate suitable response trajectories for a robot based on the observed human partner's motion. The trajectories are adapted with task space constraints to ensure accurate handovers. Results from a pilot study show that our approach is perceived as more human-like compared to a baseline Inverse Kinematics approach.

* Accepted as a Late Breaking Report in The ACM/IEEE International Conference on Human Robot Interaction (HRI) 2024
