Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ingmar Posner

Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion

May 29, 2024

Alexander L. Mitchell, Wolfgang Merkt, Aristotelis Papatheodorou, Ioannis Havoutis, Ingmar Posner

Figure 1 for Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion

Figure 2 for Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion

Figure 3 for Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion

Figure 4 for Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion

Abstract:The current state-of-the-art in quadruped locomotion is able to produce robust motion for terrain traversal but requires the segmentation of a desired robot trajectory into a discrete set of locomotion skills such as trot and crawl. In contrast, in this work we demonstrate the feasibility of learning a single, unified representation for quadruped locomotion enabling continuous blending between gait types and characteristics. We present Gaitor, which learns a disentangled representation of locomotion skills, thereby sharing information common to all gait types seen during training. The structure emerging in the learnt representation is interpretable in that it is found to encode phase correlations between the different gait types. These can be leveraged to produce continuous gait transitions. In addition, foot swing characteristics are disentangled and directly addressable. Together with a rudimentary terrain encoding and a learned planner operating in this structured latent representation, Gaitor is able to take motion commands including desired gait type and characteristics from a user while reacting to uneven terrain. We evaluate Gaitor in both simulated and real-world settings on the ANYmal C platform. To the best of our knowledge, this is the first work learning such a unified and interpretable latent representation for multiple gaits, resulting in on-demand continuous blending between different locomotion modes on a real quadruped robot.

* 10 pages, 8 figures, 2 tables

Via

Access Paper or Ask Questions

Compete and Compose: Learning Independent Mechanisms for Modular World Models

Apr 23, 2024

Anson Lei, Frederik Nolte, Bernhard Schölkopf, Ingmar Posner

Figure 1 for Compete and Compose: Learning Independent Mechanisms for Modular World Models

Figure 2 for Compete and Compose: Learning Independent Mechanisms for Modular World Models

Figure 3 for Compete and Compose: Learning Independent Mechanisms for Modular World Models

Figure 4 for Compete and Compose: Learning Independent Mechanisms for Modular World Models

Abstract:We present COmpetitive Mechanisms for Efficient Transfer (COMET), a modular world model which leverages reusable, independent mechanisms across different environments. COMET is trained on multiple environments with varying dynamics via a two-step process: competition and composition. This enables the model to recognise and learn transferable mechanisms. Specifically, in the competition phase, COMET is trained with a winner-takes-all gradient allocation, encouraging the emergence of independent mechanisms. These are then re-used in the composition phase, where COMET learns to re-compose learnt mechanisms in ways that capture the dynamics of intervened environments. In so doing, COMET explicitly reuses prior knowledge, enabling efficient and interpretable adaptation. We evaluate COMET on environments with image-based observations. In contrast to competitive baselines, we demonstrate that COMET captures recognisable mechanisms without supervision. Moreover, we show that COMET is able to adapt to new environments with varying numbers of objects with improved sample efficiency compared to more conventional finetuning approaches.

Via

Access Paper or Ask Questions

D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation

Mar 19, 2024

Jun Yamada, Shaohong Zhong, Jack Collins, Ingmar Posner

Figure 1 for D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation

Figure 2 for D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation

Figure 3 for D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation

Figure 4 for D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation

Abstract:Mastering dexterous robotic manipulation of deformable objects is vital for overcoming the limitations of parallel grippers in real-world applications. Current trajectory optimisation approaches often struggle to solve such tasks due to the large search space and the limited task information available from a cost function. In this work, we propose D-Cubed, a novel trajectory optimisation method using a latent diffusion model (LDM) trained from a task-agnostic play dataset to solve dexterous deformable object manipulation tasks. D-Cubed learns a skill-latent space that encodes short-horizon actions in the play dataset using a VAE and trains a LDM to compose the skill latents into a skill trajectory, representing a long-horizon action trajectory in the dataset. To optimise a trajectory for a target task, we introduce a novel gradient-free guided sampling method that employs the Cross-Entropy method within the reverse diffusion process. In particular, D-Cubed samples a small number of noisy skill trajectories using the LDM for exploration and evaluates the trajectories in simulation. Then, D-Cubed selects the trajectory with the lowest cost for the subsequent reverse process. This effectively explores promising solution areas and optimises the sampled trajectories towards a target task throughout the reverse diffusion process. Through empirical evaluation on a public benchmark of dexterous deformable object manipulation tasks, we demonstrate that D-Cubed outperforms traditional trajectory optimisation and competitive baseline approaches by a significant margin. We further demonstrate that trajectories found by D-Cubed readily transfer to a real-world LEAP hand on a folding task.

* https://applied-ai-lab.github.io/D-cubed/

Via

Access Paper or Ask Questions

DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer

Feb 26, 2024

Yizhe Wu, Haitz Sáez de Ocáriz Borde, Jack Collins, Oiwi Parker Jones, Ingmar Posner

Figure 1 for DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer

Figure 2 for DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer

Figure 3 for DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer

Figure 4 for DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer

Abstract:3D scene understanding for robotic applications exhibits a unique set of requirements including real-time inference, object-centric latent representation learning, accurate 6D pose estimation and 3D reconstruction of objects. Current methods for scene understanding typically rely on a combination of trained models paired with either an explicit or learnt volumetric representation, all of which have their own drawbacks and limitations. We introduce DreamUp3D, a novel Object-Centric Generative Model (OCGM) designed explicitly to perform inference on a 3D scene informed only by a single RGB-D image. DreamUp3D is a self-supervised model, trained end-to-end, and is capable of segmenting objects, providing 3D object reconstructions, generating object-centric latent representations and accurate per-object 6D pose estimates. We compare DreamUp3D to baselines including NeRFs, pre-trained CLIP-features, ObSurf, and ObPose, in a range of tasks including 3D scene reconstruction, object matching and object pose estimation. Our experiments show that our model outperforms all baselines by a significant margin in real-world scenarios displaying its applicability for 3D scene understanding tasks while meeting the strict demands exhibited in robotics applications.

Via

Access Paper or Ask Questions

World Models via Policy-Guided Trajectory Diffusion

Dec 17, 2023

Marc Rigter, Jun Yamada, Ingmar Posner

Figure 1 for World Models via Policy-Guided Trajectory Diffusion

Figure 2 for World Models via Policy-Guided Trajectory Diffusion

Figure 3 for World Models via Policy-Guided Trajectory Diffusion

Figure 4 for World Models via Policy-Guided Trajectory Diffusion

Abstract:World models are a powerful tool for developing intelligent agents. By predicting the outcome of a sequence of actions, world models enable policies to be optimised via on-policy reinforcement learning (RL) using synthetic data, i.e. in "in imagination". Existing world models are autoregressive in that they interleave predicting the next state with sampling the next action from the policy. Prediction error inevitably compounds as the trajectory length grows. In this work, we propose a novel world modelling approach that is not autoregressive and generates entire on-policy trajectories in a single pass through a diffusion model. Our approach, Policy-Guided Trajectory Diffusion (PolyGRAD), leverages a denoising model in addition to the gradient of the action distribution of the policy to diffuse a trajectory of initially random states and actions into an on-policy synthetic trajectory. We analyse the connections between PolyGRAD, score-based generative models, and classifier-guided diffusion models. Our results demonstrate that PolyGRAD outperforms state-of-the-art baselines in terms of trajectory prediction error for moderate-length trajectories, with the exception of autoregressive diffusion. At short horizons, PolyGRAD obtains comparable errors to autoregressive diffusion, but with significantly lower computational requirements. Our experiments also demonstrate that PolyGRAD enables performant policies to be trained via on-policy RL in imagination for MuJoCo continuous control domains. Thus, PolyGRAD introduces a new paradigm for scalable and non-autoregressive on-policy world modelling.

Via

Access Paper or Ask Questions

TWIST: Teacher-Student World Model Distillation for Efficient Sim-to-Real Transfer

Nov 07, 2023

Jun Yamada, Marc Rigter, Jack Collins, Ingmar Posner

Abstract:Model-based RL is a promising approach for real-world robotics due to its improved sample efficiency and generalization capabilities compared to model-free RL. However, effective model-based RL solutions for vision-based real-world applications require bridging the sim-to-real gap for any world model learnt. Due to its significant computational cost, standard domain randomisation does not provide an effective solution to this problem. This paper proposes TWIST (Teacher-Student World Model Distillation for Sim-to-Real Transfer) to achieve efficient sim-to-real transfer of vision-based model-based RL using distillation. Specifically, TWIST leverages state observations as readily accessible, privileged information commonly garnered from a simulator to significantly accelerate sim-to-real transfer. Specifically, a teacher world model is trained efficiently on state information. At the same time, a matching dataset is collected of domain-randomised image observations. The teacher world model then supervises a student world model that takes the domain-randomised image observations as input. By distilling the learned latent dynamics model from the teacher to the student model, TWIST achieves efficient and effective sim-to-real transfer for vision-based model-based RL tasks. Experiments in simulated and real robotics tasks demonstrate that our approach outperforms naive domain randomisation and model-free methods in terms of sample efficiency and task performance of sim-to-real transfer.

* 7 pages, 6 figures

Via

Access Paper or Ask Questions

Neural Latent Geometry Search: Product Manifold Inference via Gromov-Hausdorff-Informed Bayesian Optimization

Sep 20, 2023

Haitz Saez de Ocariz Borde, Alvaro Arroyo, Ismael Morales, Ingmar Posner, Xiaowen Dong

Abstract:Recent research indicates that the performance of machine learning models can be improved by aligning the geometry of the latent space with the underlying data structure. Rather than relying solely on Euclidean space, researchers have proposed using hyperbolic and spherical spaces with constant curvature, or combinations thereof, to better model the latent space and enhance model performance. However, little attention has been given to the problem of automatically identifying the optimal latent geometry for the downstream task. We mathematically define this novel formulation and coin it as neural latent geometry search (NLGS). More specifically, we introduce a principled method that searches for a latent geometry composed of a product of constant curvature model spaces with minimal query evaluations. To accomplish this, we propose a novel notion of distance between candidate latent geometries based on the Gromov-Hausdorff distance from metric geometry. In order to compute the Gromov-Hausdorff distance, we introduce a mapping function that enables the comparison of different manifolds by embedding them in a common high-dimensional ambient space. Finally, we design a graph search space based on the calculated distances between candidate manifolds and use Bayesian optimization to search for the optimal latent geometry in a query-efficient manner. This is a general method which can be applied to search for the optimal latent geometry for a variety of models and downstream tasks. Extensive experiments on synthetic and real-world datasets confirm the efficacy of our method in identifying the optimal latent geometry for multiple machine learning problems.

Via

Access Paper or Ask Questions

Gromov-Hausdorff Distances for Comparing Product Manifolds of Model Spaces

Sep 09, 2023

Haitz Saez de Ocariz Borde, Alvaro Arroyo, Ismael Morales, Ingmar Posner, Xiaowen Dong

Figure 1 for Gromov-Hausdorff Distances for Comparing Product Manifolds of Model Spaces

Figure 2 for Gromov-Hausdorff Distances for Comparing Product Manifolds of Model Spaces

Figure 3 for Gromov-Hausdorff Distances for Comparing Product Manifolds of Model Spaces

Abstract:Recent studies propose enhancing machine learning models by aligning the geometric characteristics of the latent space with the underlying data structure. Instead of relying solely on Euclidean space, researchers have suggested using hyperbolic and spherical spaces with constant curvature, or their combinations (known as product manifolds), to improve model performance. However, there exists no principled technique to determine the best latent product manifold signature, which refers to the choice and dimensionality of manifold components. To address this, we introduce a novel notion of distance between candidate latent geometries using the Gromov-Hausdorff distance from metric geometry. We propose using a graph search space that uses the estimated Gromov-Hausdorff distances to search for the optimal latent geometry. In this work we focus on providing a description of an algorithm to compute the Gromov-Hausdorff distance between model spaces and its computational implementation.

* arXiv admin note: substantial text overlap with arXiv:2309.04810

Via

Access Paper or Ask Questions

AutoGraph: Predicting Lane Graphs from Traffic Observations

Jun 27, 2023

Jannik Zürn, Ingmar Posner, Wolfram Burgard

Figure 1 for AutoGraph: Predicting Lane Graphs from Traffic Observations

Figure 2 for AutoGraph: Predicting Lane Graphs from Traffic Observations

Figure 3 for AutoGraph: Predicting Lane Graphs from Traffic Observations

Figure 4 for AutoGraph: Predicting Lane Graphs from Traffic Observations

Abstract:Lane graph estimation is a long-standing problem in the context of autonomous driving. Previous works aimed at solving this problem by relying on large-scale, hand-annotated lane graphs, introducing a data bottleneck for training models to solve this task. To overcome this limitation, we propose to use the motion patterns of traffic participants as lane graph annotations. In our AutoGraph approach, we employ a pre-trained object tracker to collect the tracklets of traffic participants such as vehicles and trucks. Based on the location of these tracklets, we predict the successor lane graph from an initial position using overhead RGB images only, not requiring any human supervision. In a subsequent stage, we show how the individual successor predictions can be aggregated into a consistent lane graph. We demonstrate the efficacy of our approach on the UrbanLaneGraph dataset and perform extensive quantitative and qualitative evaluations, indicating that AutoGraph is on par with models trained on hand-annotated graph data. Model and dataset will be made available at redacted-for-review.

* 8 pages, 6 figures

Via

Access Paper or Ask Questions

Reward-Free Curricula for Training Robust World Models

Jun 15, 2023

Marc Rigter, Minqi Jiang, Ingmar Posner

Abstract:There has been a recent surge of interest in developing generally-capable agents that can adapt to new tasks without additional training in the environment. Learning world models from reward-free exploration is a promising approach, and enables policies to be trained using imagined experience for new tasks. Achieving a general agent requires robustness across different environments. However, different environments may require different amounts of data to learn a suitable world model. In this work, we address the problem of efficiently learning robust world models in the reward-free setting. As a measure of robustness, we consider the minimax regret objective. We show that the minimax regret objective can be connected to minimising the maximum error in the world model across environments. This informs our algorithm, WAKER: Weighted Acquisition of Knowledge across Environments for Robustness. WAKER selects environments for data collection based on the estimated error of the world model for each environment. Our experiments demonstrate that WAKER outperforms naive domain randomisation, resulting in improved robustness, efficiency, and generalisation.

Via

Access Paper or Ask Questions