Wolfram Burgard

Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models

Oct 23, 2023

Iman Nematollahi, Kirill Yankov, Wolfram Burgard, Tim Welschehold

Abstract: A long-standing challenge for a robotic manipulation system operating in real-world scenarios is adapting and generalizing its acquired motor skills to unseen environments. We tackle this challenge by employing hybrid skill models that integrate imitation and reinforcement paradigms, to explore how the learning and adaptation of a skill, along with its core grounding in the scene through a learned keypoint, can facilitate such generalization. To that end, we develop the Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models (KIS-GMM) approach, which learns to predict the reference of a dynamical system within the scene as a 3D keypoint, leveraging visual observations obtained by the robot's physical interactions during skill learning. Through comprehensive evaluations in both simulated and real-world environments, we show that our method enables a robot to achieve significant zero-shot generalization to novel environments and to refine skills in the target environments faster than learning from scratch. Importantly, this is achieved without the need for new ground truth data. Moreover, our method effectively copes with scene displacements.

* Accepted at the International Symposium on Experimental Robotics (ISER) 2023. Videos at http://kis-gmm.cs.uni-freiburg.de/

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023

Open X-Embodiment Collaboration, Abhishek Padalkar, Acorn Pooley, Ajinkya Jain, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anikait Singh(+167 more)

Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.

Care3D: An Active 3D Object Detection Dataset of Real Robotic-Care Environments

Oct 09, 2023

Michael G. Adam, Sebastian Eger, Martin Piccolrovazzi, Maged Iskandar, Joern Vogel, Alexander Dietrich, Seongjien Bien, Jon Skerlj, Abdeldjallil Naceri, Eckehard Steinbach(+3 more)

Abstract: As the labor shortage in the health sector increases, the demand for assistive robotics grows. However, the test data needed to develop such robots is scarce, especially for the application of active 3D object detection, where no real data exists at all. This short paper counters this by introducing such an annotated dataset of real environments. The captured environments represent areas which are already in use in the field of robotic health care research. We further provide ground truth data within one room, for assessing SLAM algorithms running directly on a health care robot.

LAN-grasp: Using Large Language Models for Semantic Object Grasping

Oct 08, 2023

Reihaneh Mirjalili, Michael Krawez, Simone Silenzi, Yannik Blei, Wolfram Burgard

Abstract: In this paper, we propose LAN-grasp, a novel approach towards more appropriate semantic grasping. We use foundation models to provide the robot with a deeper understanding of the objects, the right place to grasp an object, or even the parts to avoid. This allows our robot to grasp and utilize objects in a more meaningful and safe manner. We leverage the combination of a Large Language Model, a Vision Language Model, and a traditional grasp planner to generate grasps demonstrating a deeper semantic understanding of the objects. We first prompt the Large Language Model about which object part is appropriate for grasping. Next, the Vision Language Model identifies the corresponding part in the object image. Finally, we generate grasp proposals in the region proposed by the Vision Language Model. Building on foundation models provides us with a zero-shot grasp method that can handle a wide range of objects without the need for further training or fine-tuning. We evaluated our method in real-world experiments on a custom object data set. We present the results of a survey that asks the participants to choose an object part appropriate for grasping. The results show that the grasps generated by our method are consistently ranked higher by the participants than those generated by a conventional grasping planner and a recent semantic grasping approach.

Collaborative Dynamic 3D Scene Graphs for Automated Driving

Sep 19, 2023

Elias Greve, Martin Büchner, Niclas Vödisch, Wolfram Burgard, Abhinav Valada

Abstract: Maps have played an indispensable role in enabling safe and automated driving. Although there have been many advances on different fronts ranging from SLAM to semantics, building an actionable hierarchical semantic representation of urban dynamic scenes from multiple agents is still a challenging problem. In this work, we present Collaborative URBan Scene Graphs (CURB-SG) that enable higher-order reasoning and efficient querying for many functions of automated driving. CURB-SG leverages panoptic LiDAR data from multiple agents to build large-scale maps using an effective graph-based collaborative SLAM approach that detects inter-agent loop closures. To semantically decompose the obtained 3D map, we build a lane graph from the paths of ego agents and their panoptic observations of other vehicles. Based on the connectivity of the lane graph, we segregate the environment into intersecting and non-intersecting road areas. Subsequently, we construct a multi-layered scene graph that includes lane information, the position of static landmarks and their assignment to certain map sections, other vehicles observed by the ego agents, and the pose graph from SLAM including 3D panoptic point clouds. We extensively evaluate CURB-SG in urban scenarios using a photorealistic simulator. We release our code at http://curb.cs.uni-freiburg.de.

* Refined manuscript and extended supplementary

Few-Shot Panoptic Segmentation With Foundation Models

Sep 19, 2023

Markus Käppeler, Kürsat Petek, Niclas Vödisch, Wolfram Burgard, Abhinav Valada

Abstract: Current state-of-the-art methods for panoptic segmentation require an immense amount of annotated training data that is both arduous and expensive to obtain, posing a significant challenge for their widespread adoption. Concurrently, recent breakthroughs in visual representation learning have sparked a paradigm shift, leading to the advent of large foundation models that can be trained with completely unlabeled images. In this work, we propose to leverage such task-agnostic image features to enable few-shot panoptic segmentation by presenting Segmenting Panoptic Information with Nearly 0 labels (SPINO). In detail, our method combines a DINOv2 backbone with lightweight network heads for semantic segmentation and boundary estimation. We show that our approach, albeit trained with only ten annotated images, predicts high-quality pseudo-labels that can be used with any existing panoptic segmentation method. Notably, we demonstrate that SPINO achieves competitive results compared to fully supervised baselines while using less than 0.3% of the ground truth labels, paving the way for learning complex visual recognition tasks leveraging foundation models. To illustrate its general applicability, we further deploy SPINO on real-world robotic vision systems for both outdoor and indoor environments. To foster future research, we make the code and trained models publicly available at http://spino.cs.uni-freiburg.de.

A Smart Robotic System for Industrial Plant Supervision

Sep 01, 2023

D. Adriana Gómez-Rosal, Max Bergau, Georg K. J. Fischer, Andreas Wachaja, Johannes Gräter, Matthias Odenweller, Uwe Piechottka, Fabian Hoeflinger, Nikhil Gosala, Niklas Wetzel(+3 more)

Abstract: In today's chemical plants, human field operators perform frequent integrity checks to guarantee high safety standards, and thus are possibly the first to encounter dangerous operating conditions. To alleviate their task, we present a system consisting of an autonomously navigating robot integrated with various sensors and intelligent data processing. It is able to detect methane leaks and estimate their flow rate, detect more general gas anomalies, recognize oil films, localize sound sources and detect failure cases, map the environment in 3D, and navigate autonomously, employing recognition and avoidance of dynamic obstacles. We evaluate our system at a wastewater facility in full working conditions. Our results demonstrate that the system is able to robustly navigate the plant and provide useful information about critical operating conditions.

* Final submission for IEEE Sensors 2023

POV-SLAM: Probabilistic Object-Aware Variational SLAM in Semi-Static Environments

Jul 02, 2023

Jingxing Qian, Veronica Chatrath, James Servos, Aaron Mavrinac, Wolfram Burgard, Steven L. Waslander, Angela P. Schoellig

Abstract: Simultaneous localization and mapping (SLAM) in slowly varying scenes is important for long-term robot task completion. Failing to detect scene changes may lead to inaccurate maps and, ultimately, lost robots. Classical SLAM algorithms assume static scenes, and recent works take dynamics into account, but require scene changes to be observed in consecutive frames. Semi-static scenes, wherein objects appear, disappear, or move slowly over time, are often overlooked, yet are critical for long-term operation. We propose an object-aware, factor-graph SLAM framework that tracks and reconstructs semi-static object-level changes. Our novel variational expectation-maximization strategy is used to optimize factor graphs involving a Gaussian-Uniform bimodal measurement likelihood for potentially-changing objects. We evaluate our approach alongside state-of-the-art SLAM solutions in simulation and on our novel real-world SLAM dataset captured in a warehouse over four months. Our method improves the robustness of localization in the presence of semi-static changes, providing object-level reasoning about the scene.

* Published in Robotics: Science and Systems (RSS) 2023

Geometric Regularity with Robot Intrinsic Symmetry in Reinforcement Learning

Jun 28, 2023

Shengchao Yan, Yuan Zhang, Baohe Zhang, Joschka Boedecker, Wolfram Burgard

Abstract: Geometric regularity, which leverages data symmetry, has been successfully incorporated into deep learning architectures such as CNNs, RNNs, GNNs, and Transformers. While this concept has been widely applied in robotics to address the curse of dimensionality when learning from high-dimensional data, the inherent reflectional and rotational symmetry of robot structures has not been adequately explored. Drawing inspiration from cooperative multi-agent reinforcement learning, we introduce novel network structures for deep learning algorithms that explicitly capture this geometric regularity. Moreover, we investigate the relationship between the geometric prior and the concept of Parameter Sharing in multi-agent reinforcement learning. Through experiments conducted on various challenging continuous control tasks, we demonstrate the significant potential of the proposed geometric regularity in enhancing robot learning capabilities.

* Accepted at the RSS 2023 Workshop on Symmetries in Robot Learning

AutoGraph: Predicting Lane Graphs from Traffic Observations

Jun 27, 2023

Jannik Zürn, Ingmar Posner, Wolfram Burgard

Abstract: Lane graph estimation is a long-standing problem in the context of autonomous driving. Previous works aimed at solving this problem by relying on large-scale, hand-annotated lane graphs, introducing a data bottleneck for training models to solve this task. To overcome this limitation, we propose to use the motion patterns of traffic participants as lane graph annotations. In our AutoGraph approach, we employ a pre-trained object tracker to collect the tracklets of traffic participants such as vehicles and trucks. Based on the location of these tracklets, we predict the successor lane graph from an initial position using overhead RGB images only, not requiring any human supervision. In a subsequent stage, we show how the individual successor predictions can be aggregated into a consistent lane graph. We demonstrate the efficacy of our approach on the UrbanLaneGraph dataset and perform extensive quantitative and qualitative evaluations, indicating that AutoGraph is on par with models trained on hand-annotated graph data. Model and dataset will be made available at redacted-for-review.

* 8 pages, 6 figures
