Marco Hutter

Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning

Nov 13, 2023

Arjun Bhardwaj, Jonas Rothfuss, Bhavya Sukhija, Yarden As, Marco Hutter, Stelian Coros, Andreas Krause

Abstract: We introduce PACOH-RL, a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics. PACOH-RL meta-learns priors for the dynamics model, allowing swift adaptation to new dynamics with minimal interaction data. Existing Meta-RL methods require abundant meta-learning data, limiting their applicability in settings such as robotics, where data is costly to obtain. To address this, PACOH-RL incorporates regularization and epistemic uncertainty quantification in both the meta-learning and task adaptation stages. When facing new dynamics, we use these uncertainty estimates to effectively guide exploration and data collection. Overall, this enables positive transfer, even when access to data from prior tasks or dynamic settings is severely limited. Our experimental results demonstrate that PACOH-RL outperforms model-based RL and model-based Meta-RL baselines in adapting to new dynamic conditions. Finally, on a real robotic car, we showcase the potential for efficient RL policy adaptation in diverse, data-scarce conditions.

Resilient Legged Local Navigation: Learning to Traverse with Compromised Perception End-to-End

Oct 05, 2023

Jin Jin, Chong Zhang, Jonas Frey, Nikita Rudin, Matias Mattamala, Cesar Cadena, Marco Hutter

Abstract: Autonomous robots must navigate reliably in unknown environments even under compromised exteroceptive perception, or perception failures. Such failures often occur when harsh environments lead to degraded sensing, or when the perception algorithm misinterprets the scene due to limited generalization. In this paper, we model perception failures as invisible obstacles and pits, and train a reinforcement learning (RL) based local navigation policy to guide our legged robot. Unlike previous works relying on heuristics and anomaly detection to update navigational information, we train our navigation policy to reconstruct the environment information in the latent space from corrupted perception and react to perception failures end-to-end. To this end, we incorporate both proprioception and exteroception into our policy inputs, thereby enabling the policy to sense collisions on different body parts and pits, prompting corresponding reactions. We validate our approach in simulation and on the real quadruped robot ANYmal running in real-time (<10 ms CPU inference). In a quantitative comparison with existing heuristic-based locally reactive planners, our policy increases the success rate by over 30% when facing perception failures. Project Page: https://bit.ly/45NBTuh.

* Website and videos are available at our Project Page: https://bit.ly/45NBTuh

ViPlanner: Visual Semantic Imperative Learning for Local Navigation

Oct 02, 2023

Pascal Roth, Julian Nubert, Fan Yang, Mayank Mittal, Marco Hutter

Abstract: Real-time path planning in outdoor environments still challenges modern robotic systems due to differences in terrain traversability, diverse obstacles, and the necessity for fast decision-making. Established approaches have primarily focused on geometric navigation solutions, which work well for structured geometric obstacles but have limitations regarding the semantic interpretation of different terrain types and their affordances. Moreover, these methods fail to identify traversable geometric occurrences, such as stairs. To overcome these issues, we introduce ViPlanner, a learned local path planning approach that generates local plans based on geometric and semantic information. The system is trained using the Imperative Learning paradigm, for which the network weights are optimized end-to-end based on the planning task objective. This optimization uses a differentiable formulation of a semantic costmap, which enables the planner to distinguish between the traversability of different terrains and accurately identify obstacles. The semantic information is represented in 30 classes using an RGB colorspace that can effectively encode the multiple levels of traversability. We show that the planner can adapt to diverse real-world environments without requiring any real-world training. In fact, the planner is trained purely in simulation, enabling highly scalable training data generation. Experimental results demonstrate resistance to noise, zero-shot sim-to-real transfer, and a decrease of 38.02% in terms of traversability cost compared to purely geometric-based approaches. Code and models are made publicly available: https://github.com/leggedrobotics/viplanner.

MEM: Multi-Modal Elevation Mapping for Robotics and Learning

Sep 28, 2023

Gian Erni, Jonas Frey, Takahiro Miki, Matias Mattamala, Marco Hutter

Abstract: Elevation maps are commonly used to represent the environment of mobile robots and are instrumental for locomotion and navigation tasks. However, pure geometric information is insufficient for many field applications that require appearance or semantic information, which limits their applicability to other platforms or domains. In this work, we extend a 2.5D robot-centric elevation mapping framework by fusing multi-modal information from multiple sources into a popular map representation. The framework allows inputting data contained in point clouds or images in a unified manner. To manage the different nature of the data, we also present a set of fusion algorithms that can be selected based on the information type and user requirements. Our system is designed to run on the GPU, making it real-time capable for various robotic and learning tasks. We demonstrate the capabilities of our framework by deploying it on multiple robots with varying sensor configurations and showcasing a range of applications that utilize multi-modal layers, including line detection, human detection, and colorization.

* Accepted for IROS 2023. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Evaluation of Constrained Reinforcement Learning Algorithms for Legged Locomotion

Sep 27, 2023

Joonho Lee, Lukas Schroth, Victor Klemm, Marko Bjelonic, Alexander Reske, Marco Hutter

Abstract: Shifting from traditional control strategies to Deep Reinforcement Learning (RL) for legged robots poses inherent challenges, especially when addressing real-world physical constraints during training. While high-fidelity simulations provide significant benefits, they often bypass these essential physical limitations. In this paper, we experiment with the Constrained Markov Decision Process (CMDP) framework instead of the conventional unconstrained RL for robotic applications. We perform a comparative study of different constrained policy optimization algorithms to identify suitable methods for practical implementation. Our robot experiments demonstrate the critical role of incorporating physical constraints, yielding successful sim-to-real transfers, and reducing operational errors on physical systems. The CMDP formulation streamlines the training process by separately handling constraints from rewards. Our findings underscore the potential of constrained RL for the effective development and deployment of learned controllers in robotics.
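To illustrate what handling constraints separately from rewards can look like in a CMDP, here is a minimal Python sketch of Lagrangian relaxation, one common constrained-RL technique. The abstract does not name the algorithms that were compared, so this is not necessarily one of them; the function names, the learning rate, and the cost limit are all hypothetical.

```python
def update_lagrange_multiplier(lam, avg_cost, cost_limit, lr=0.1):
    """One dual-ascent step on the multiplier (illustrative, not the paper's method).

    The multiplier grows while the average constraint cost exceeds its limit
    and shrinks otherwise; it is clipped at zero so it stays non-negative.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_objective(avg_reward, avg_cost, cost_limit, lam):
    """Reward objective minus the weighted constraint violation."""
    return avg_reward - lam * (avg_cost - cost_limit)


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical per-iteration average costs against a limit of 1.0.
    for avg_cost in (2.0, 1.5, 0.5):
        lam = update_lagrange_multiplier(lam, avg_cost, cost_limit=1.0)
        print(round(lam, 3), penalized_objective(10.0, avg_cost, 1.0, lam))
```

In this sketch the reward term is never re-tuned when a constraint changes; only the cost limit and the multiplier are involved, which is the separation the abstract refers to.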

DTC: Deep Tracking Control -- A Unifying Approach to Model-Based Planning and Reinforcement-Learning for Versatile and Robust Locomotion

Sep 27, 2023

Fabian Jenelten, Junzhe He, Farbod Farshidian, Marco Hutter

Abstract: Legged locomotion is a complex control problem that requires both accuracy and robustness to cope with real-world challenges. Legged systems have traditionally been controlled using trajectory optimization with inverse dynamics. Such hierarchical model-based methods are appealing due to intuitive cost function tuning, accurate planning, and most importantly, the insightful understanding gained from more than one decade of extensive research. However, model mismatch and violation of assumptions are common sources of faulty operation and may hinder successful sim-to-real transfer. Simulation-based reinforcement learning, on the other hand, results in locomotion policies with unprecedented robustness and recovery skills. Yet, all learning algorithms struggle with sparse rewards emerging from environments where valid footholds are rare, such as gaps or stepping stones. In this work, we propose a hybrid control architecture that combines the advantages of both worlds to simultaneously achieve greater robustness, foot-placement accuracy, and terrain generalization. Our approach utilizes a model-based planner to roll out a reference motion during training. A deep neural network policy is trained in simulation, aiming to track the optimized footholds. We evaluate the accuracy of our locomotion pipeline on sparse terrains, where pure data-driven methods are prone to fail. Furthermore, we demonstrate superior robustness in the presence of slippery or deformable ground when compared to model-based counterparts. Finally, we show that our proposed tracking controller generalizes across different trajectory optimization methods not seen during training. In conclusion, our work unites the predictive capabilities and optimality guarantees of online planning with the inherent robustness attributed to offline learning.

Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning

Sep 25, 2023

Lukas Schneider, Jonas Frey, Takahiro Miki, Marco Hutter

Abstract: Deployment in hazardous environments requires robots to understand the risks associated with their actions and movements to prevent accidents. Despite their importance, these risks are not explicitly modeled by currently deployed locomotion controllers for legged robots. In this work, we propose a risk-sensitive locomotion training method employing distributional reinforcement learning to consider safety explicitly. Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot's interaction with the environment. The value distribution is consumed by a risk metric to extract risk-sensitive value estimates. These are integrated into Proximal Policy Optimization (PPO) to derive our method, Distributional Proximal Policy Optimization (DPPO). The risk preference, ranging from risk-averse to risk-seeking, can be controlled by a single parameter, which enables the robot's behavior to be adjusted dynamically. Importantly, our approach removes the need for additional reward function tuning to achieve risk sensitivity. We show emergent risk-sensitive locomotion behavior in simulation and on the quadrupedal robot ANYmal.
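As a rough illustration of how a risk metric might turn a value distribution into a single risk-sensitive estimate, the Python sketch below averages one tail of a set of quantile estimates, in the spirit of conditional value-at-risk (CVaR). The abstract does not specify the risk metric used in DPPO, so the tail-averaging rule, the function name, and the parameter `beta` are assumptions made for illustration only.

```python
def risk_sensitive_value(quantiles, beta):
    """Collapse quantile estimates of the return into one scalar (illustrative).

    beta in [-1, 1] is a hypothetical risk parameter:
      beta = 0  -> plain mean (risk-neutral)
      beta < 0  -> mean of the lowest (1 - |beta|) fraction (risk-averse)
      beta > 0  -> mean of the highest (1 - |beta|) fraction (risk-seeking)
    """
    if not -1.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [-1, 1]")
    q = sorted(float(x) for x in quantiles)
    if not q:
        raise ValueError("need at least one quantile estimate")
    n = len(q)
    # Keep at least one quantile so beta = -1 or 1 picks the extreme value.
    k = max(1, round((1.0 - abs(beta)) * n))
    tail = q[:k] if beta < 0 else q[n - k:]
    return sum(tail) / k


if __name__ == "__main__":
    estimates = [3.0, 0.0, 2.0, 1.0]  # hypothetical quantile estimates
    for beta in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(beta, risk_sensitive_value(estimates, beta))
```

Because the risk preference enters only through `beta`, the same learned distribution can be read pessimistically or optimistically without changing the reward, which mirrors the single-parameter control described in the abstract.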

PyPose v0.6: The Imperative Programming Interface for Robotics

Sep 22, 2023

Zitong Zhan, Xiangfu Li, Qihang Li, Haonan He, Abhinav Pandey, Haitao Xiao, Yangmengfei Xu, Xiangyu Chen, Kuan Xu, Kun Cao (+26 more)

Abstract: PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. From its initial launch in early 2022, PyPose has experienced significant enhancements, incorporating a wide variety of new features into its platform. To satisfy the growing demand for understanding and utilizing the library and ease the learning curve for new users, we present the fundamental design principle of the imperative programming interface, and showcase the flexible usage of diverse functionalities and modules using an extremely simple Dubins car example. We also demonstrate that PyPose can be easily used to navigate a real quadruped robot with a few lines of code.
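For readers unfamiliar with the Dubins car mentioned above, the sketch below integrates its standard kinematics (constant forward speed, controllable turn rate) with one Euler step. It is written in plain Python and deliberately does not use the PyPose API, which the abstract does not show; the function name and the parameters are hypothetical.

```python
import math


def dubins_step(state, speed, turn_rate, dt):
    """Advance a Dubins car by one Euler step (plain Python, not PyPose).

    state is (x, y, theta) with theta the heading in radians.
    """
    x, y, theta = state
    x += speed * math.cos(theta) * dt
    y += speed * math.sin(theta) * dt
    # Wrap the heading into [-pi, pi) to keep it bounded over long rollouts.
    theta = (theta + turn_rate * dt + math.pi) % (2.0 * math.pi) - math.pi
    return (x, y, theta)


if __name__ == "__main__":
    state = (0.0, 0.0, 0.0)
    # Drive forward while turning left at a constant rate.
    for _ in range(5):
        state = dubins_step(state, speed=1.0, turn_rate=0.2, dt=0.1)
    print(state)
```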

Towards Autonomous Excavation Planning

Aug 22, 2023

Lorenzo Terenzi, Marco Hutter

Abstract: Excavation plans are crucial in construction projects, dictating the dirt disposal strategy and excavation sequence based on the final geometry and machinery available. While most construction processes rely heavily on coarse sequence planning and local execution planning driven by human expertise and intuition, fully automated planning tools are notably absent from the industry. This paper introduces a fully autonomous excavation planning system. Initially, the site is mapped, followed by user selection of the desired excavation geometry. The system then invokes a global planner to determine the sequence of poses for the excavator, ensuring complete site coverage. For each pose, a local excavation planner decides how to move the soil around the machine, and a digging planner subsequently dictates the sequence of digging trajectories to complete a patch. We showcased our system by autonomously excavating the largest pit documented so far, achieving an average digging cycle time of roughly 30 seconds, comparable to that of a human operator.

Versatile Multi-Contact Planning and Control for Legged Loco-Manipulation

Aug 17, 2023

Jean-Pierre Sleiman, Farbod Farshidian, Marco Hutter

Abstract: Loco-manipulation planning skills are pivotal for expanding the utility of robots in everyday environments. These skills can be assessed based on a system's ability to coordinate complex holistic movements and multiple contact interactions when solving different tasks. However, existing approaches have been merely able to shape such behaviors with hand-crafted state machines, densely engineered rewards, or pre-recorded expert demonstrations. Here, we propose a minimally-guided framework that automatically discovers whole-body trajectories jointly with contact schedules for solving general loco-manipulation tasks in pre-modeled environments. The key insight is that multi-modal problems of this nature can be formulated and treated within the context of integrated Task and Motion Planning (TAMP). An effective bilevel search strategy is achieved by incorporating domain-specific rules and adequately combining the strengths of different planning techniques: trajectory optimization and informed graph search coupled with sampling-based planning. We showcase emergent behaviors for a quadrupedal mobile manipulator exploiting both prehensile and non-prehensile interactions to perform real-world tasks such as opening/closing heavy dishwashers and traversing spring-loaded doors. These behaviors are also deployed on the real system using a two-layer whole-body tracking controller.

* Science Robotics, 16 Aug 2023, Vol 8, Issue 81
