Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gerhard Neumann

Karlsruhe Institute of Technology

Neural Contractive Dynamical Systems

Jan 17, 2024

Hadi Beik-Mohammadi, Søren Hauberg, Georgios Arvanitidis, Nadia Figueroa, Gerhard Neumann, Leonel Rozo

Figure 1 for Neural Contractive Dynamical Systems

Figure 2 for Neural Contractive Dynamical Systems

Figure 3 for Neural Contractive Dynamical Systems

Figure 4 for Neural Contractive Dynamical Systems

Abstract:Stability guarantees are crucial when ensuring a fully autonomous robot does not take undesirable or potentially harmful actions. Unfortunately, global stability guarantees are hard to provide in dynamical systems learned from data, especially when the learned dynamics are governed by neural networks. We propose a novel methodology to learn neural contractive dynamical systems, where our neural architecture ensures contraction, and hence, global stability. To efficiently scale the method to high-dimensional dynamical systems, we develop a variant of the variational autoencoder that learns dynamics in a low-dimensional latent representation space while retaining contractive stability after decoding. We further extend our approach to learning contractive systems on the Lie group of rotations to account for full-pose end-effector dynamic motions. The result is the first highly flexible learning architecture that provides contractive stability guarantees with capability to perform obstacle avoidance. Empirically, we demonstrate that our approach encodes the desired dynamics more accurately than the current state-of-the-art, which provides less strong stability guarantees.

Via

Access Paper or Ask Questions

Domain-Specific Fine-Tuning of Large Language Models for Interactive Robot Programming

Dec 21, 2023

Benjamin Alt, Urs Keßner, Aleksandar Taranovic, Darko Katic, Andreas Hermann, Rainer Jäkel, Gerhard Neumann

Figure 1 for Domain-Specific Fine-Tuning of Large Language Models for Interactive Robot Programming

Abstract:Industrial robots are applied in a widening range of industries, but robot programming mostly remains a task limited to programming experts. We propose a natural language-based assistant for programming of advanced, industrial robotic applications and investigate strategies for domain-specific fine-tuning of foundation models with limited data and compute.

* 5 pages, 1 figure, accepted to the 2024 European Robotics Forum

Via

Access Paper or Ask Questions

Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Dec 15, 2023

Paul Maria Scheikl, Nicolas Schreiber, Christoph Haas, Niklas Freymuth, Gerhard Neumann, Rudolf Lioutikov, Franziska Mathis-Ullrich

Figure 1 for Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Figure 2 for Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Figure 3 for Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Figure 4 for Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Abstract:Policy learning in robot-assisted surgery (RAS) lacks data efficient and versatile methods that exhibit the desired motion quality for delicate surgical interventions. To this end, we introduce Movement Primitive Diffusion (MPD), a novel method for imitation learning (IL) in RAS that focuses on gentle manipulation of deformable objects. The approach combines the versatility of diffusion-based imitation learning (DIL) with the high-quality motion generation capabilities of Probabilistic Dynamic Movement Primitives (ProDMPs). This combination enables MPD to achieve gentle manipulation of deformable objects, while maintaining data efficiency critical for RAS applications where demonstration data is scarce. We evaluate MPD across various simulated tasks and a real world robotic setup on both state and image observations. MPD outperforms state-of-the-art DIL methods in success rate, motion quality, and data efficiency.

Via

Access Paper or Ask Questions

Registered and Segmented Deformable Object Reconstruction from a Single View Point Cloud

Nov 13, 2023

Pit Henrich, Balázs Gyenes, Paul Maria Scheikl, Gerhard Neumann, Franziska Mathis-Ullrich

Abstract:In deformable object manipulation, we often want to interact with specific segments of an object that are only defined in non-deformed models of the object. We thus require a system that can recognize and locate these segments in sensor data of deformed real world objects. This is normally done using deformable object registration, which is problem specific and complex to tune. Recent methods utilize neural occupancy functions to improve deformable object registration by registering to an object reconstruction. Going one step further, we propose a system that in addition to reconstruction learns segmentation of the reconstructed object. As the resulting output already contains the information about the segments, we can skip the registration process. Tested on a variety of deformable objects in simulation and the real world, we demonstrate that our method learns to robustly find these segments. We also introduce a simple sampling algorithm to generate better training data for occupancy learning.

* Accepted at WACV 2024

Via

Access Paper or Ask Questions

Latent Task-Specific Graph Network Simulators

Nov 09, 2023

Philipp Dahlinger, Niklas Freymuth, Michael Volpp, Tai Hoang, Gerhard Neumann

Figure 1 for Latent Task-Specific Graph Network Simulators

Figure 2 for Latent Task-Specific Graph Network Simulators

Figure 3 for Latent Task-Specific Graph Network Simulators

Figure 4 for Latent Task-Specific Graph Network Simulators

Abstract:Simulating dynamic physical interactions is a critical challenge across multiple scientific domains, with applications ranging from robotics to material science. For mesh-based simulations, Graph Network Simulators (GNSs) pose an efficient alternative to traditional physics-based simulators. Their inherent differentiability and speed make them particularly well-suited for inverse design problems. Yet, adapting to new tasks from limited available data is an important aspect for real-world applications that current methods struggle with. We frame mesh-based simulation as a meta-learning problem and use a recent Bayesian meta-learning method to improve GNSs adaptability to new scenarios by leveraging context data and handling uncertainties. Our approach, latent task-specific graph network simulator, uses non-amortized task posterior approximations to sample latent descriptions of unknown system properties. Additionally, we leverage movement primitives for efficient full trajectory prediction, effectively addressing the issue of accumulating errors encountered by previous auto-regressive methods. We validate the effectiveness of our approach through various experiments, performing on par with or better than established baseline methods. Movement primitives further allow us to accommodate various types of context data, as demonstrated through the utilization of point clouds during inference. By combining GNSs with meta-learning, we bring them closer to real-world applicability, particularly in scenarios with smaller datasets.

Via

Access Paper or Ask Questions

Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization

Oct 31, 2023

Philipp Dahlinger, Philipp Becker, Maximilian Hüttenrauch, Gerhard Neumann

Abstract:Stochastic gradient-based optimization is crucial to optimize neural networks. While popular approaches heuristically adapt the step size and direction by rescaling gradients, a more principled approach to improve optimizers requires second-order information. Such methods precondition the gradient using the objective's Hessian. Yet, computing the Hessian is usually expensive and effectively using second-order information in the stochastic gradient setting is non-trivial. We propose using Information-Theoretic Trust Region Optimization (arTuRO) for improved updates with uncertain second-order information. By modeling the network parameters as a Gaussian distribution and using a Kullback-Leibler divergence-based trust region, our approach takes bounded steps accounting for the objective's curvature and uncertainty in the parameters. Before each update, it solves the trust region problem for an optimal step size, resulting in a more stable and faster optimization process. We approximate the diagonal elements of the Hessian from stochastic gradients using a simple recursive least squares approach, constructing a model of the expected Hessian over time using only first-order information. We show that arTuRO combines the fast convergence of adaptive moment-based optimization with the generalization capabilities of SGD.

Via

Access Paper or Ask Questions

Multi Time Scale World Models

Oct 27, 2023

Vaisakh Shaj, Saleh Gholam Zadeh, Ozan Demir, Luiz Ricardo Douat, Gerhard Neumann

Figure 1 for Multi Time Scale World Models

Figure 2 for Multi Time Scale World Models

Figure 3 for Multi Time Scale World Models

Figure 4 for Multi Time Scale World Models

Abstract:Intelligent agents use internal world models to reason and make predictions about different courses of their actions at many scales. Devising learning paradigms and architectures that allow machines to learn world models that operate at multiple levels of temporal abstractions while dealing with complex uncertainty predictions is a major technical hurdle. In this work, we propose a probabilistic formalism to learn multi-time scale world models which we call the Multi Time Scale State Space (MTS3) model. Our model uses a computationally efficient inference scheme on multiple time scales for highly accurate long-horizon predictions and uncertainty estimates over several seconds into the future. Our experiments, which focus on action conditional long horizon future predictions, show that MTS3 outperforms recent methods on several system identification benchmarks including complex simulated and real-world dynamical systems.

* Accepted as spotlight at NeurIPS 2023

Via

Access Paper or Ask Questions

SA6D: Self-Adaptive Few-Shot 6D Pose Estimator for Novel and Occluded Objects

Aug 31, 2023

Ning Gao, Ngo Anh Vien, Hanna Ziesche, Gerhard Neumann

Abstract:To enable meaningful robotic manipulation of objects in the real-world, 6D pose estimation is one of the critical aspects. Most existing approaches have difficulties to extend predictions to scenarios where novel object instances are continuously introduced, especially with heavy occlusions. In this work, we propose a few-shot pose estimation (FSPE) approach called SA6D, which uses a self-adaptive segmentation module to identify the novel target object and construct a point cloud model of the target object using only a small number of cluttered reference images. Unlike existing methods, SA6D does not require object-centric reference images or any additional object information, making it a more generalizable and scalable solution across categories. We evaluate SA6D on real-world tabletop object datasets and demonstrate that SA6D outperforms existing FSPE methods, particularly in cluttered scenes with occlusions, while requiring fewer reference images.

* Conference on Robot Learning (CoRL), 2023

Via

Access Paper or Ask Questions

Enhancing Interpretable Object Abstraction via Clustering-based Slot Initialization

Aug 22, 2023

Ning Gao, Bernard Hohmann, Gerhard Neumann

Abstract:Object-centric representations using slots have shown the advances towards efficient, flexible and interpretable abstraction from low-level perceptual features in a compositional scene. Current approaches randomize the initial state of slots followed by an iterative refinement. As we show in this paper, the random slot initialization significantly affects the accuracy of the final slot prediction. Moreover, current approaches require a predetermined number of slots from prior knowledge of the data, which limits the applicability in the real world. In our work, we initialize the slot representations with clustering algorithms conditioned on the perceptual input features. This requires an additional layer in the architecture to initialize the slots given the identified clusters. We design permutation invariant and permutation equivariant versions of this layer to enable the exchangeable slot representations after clustering. Additionally, we employ mean-shift clustering to automatically identify the number of slots for a given scene. We evaluate our method on object discovery and novel view synthesis tasks with various datasets. The results show that our method outperforms prior works consistently, especially for complex scenes.

* The 34th British Machine Vision Conference (BMVC), 2023

Via

Access Paper or Ask Questions

DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in Cluttered Scenes

Aug 16, 2023

Philipp Blättner, Johannes Brand, Gerhard Neumann, Ngo Anh Vien

Figure 1 for DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in Cluttered Scenes

Figure 2 for DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in Cluttered Scenes

Figure 3 for DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in Cluttered Scenes

Figure 4 for DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in Cluttered Scenes

Abstract:Robotic grasping is a fundamental skill required for object manipulation in robotics. Multi-fingered robotic hands, which mimic the structure of the human hand, can potentially perform complex object manipulation. Nevertheless, current techniques for multi-fingered robotic grasping frequently predict only a single grasp for each inference time, limiting computational efficiency and their versatility, i.e. unimodal grasp distribution. This paper proposes a differentiable multi-fingered grasp generation network (DMFC-GraspNet) with three main contributions to address this challenge. Firstly, a novel neural grasp planner is proposed, which predicts a new grasp representation to enable versatile and dense grasp predictions. Secondly, a scene creation and label mapping method is developed for dense labeling of multi-fingered robotic hands, which allows a dense association of ground truth grasps. Thirdly, we propose to train DMFC-GraspNet end-to-end using using a forward-backward automatic differentiation approach with both a supervised loss and a differentiable collision loss and a generalized Q 1 grasp metric loss. The proposed approach is evaluated using the Shadow Dexterous Hand on Mujoco simulation and ablated by different choices of loss functions. The results demonstrate the effectiveness of the proposed approach in predicting versatile and dense grasps, and in advancing the field of multi-fingered robotic grasping.

* Submitted IROS 2023 workshop "Policy Learning in Geometric Spaces"

Via

Access Paper or Ask Questions