Erik Bauer

Latent Action Diffusion for Cross-Embodiment Manipulation

Jun 17, 2025

Erik Bauer, Elvis Nava, Robert K. Katzschmann

Abstract: End-to-end learning approaches offer great potential for robotic manipulation, but their impact is constrained by data scarcity and heterogeneity across different embodiments. In particular, diverse action spaces across different end-effectors create barriers for cross-embodiment learning and skill transfer. We address this challenge through diffusion policies learned in a latent action space that unifies diverse end-effector actions. We first show that we can learn a semantically aligned latent action space for anthropomorphic robotic hands, a human hand, and a parallel jaw gripper using encoders trained with a contrastive loss. Second, we show that by using our proposed latent action space for co-training on manipulation data from different end-effectors, we can utilize a single policy for multi-robot control and obtain up to 13% improved manipulation success rates, indicating successful skill transfer despite a significant embodiment gap. Our approach using latent cross-embodiment policies presents a new method to unify different action spaces across embodiments, enabling efficient multi-robot control and data sharing across robot setups. This unified representation significantly reduces the need for extensive data collection for each new robot morphology, accelerates generalization across embodiments, and ultimately facilitates more scalable and efficient robotic learning.

* 14 pages, 6 figures
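
The abstract above mentions encoders trained with a contrastive loss but does not say which one. As a rough illustration only, the sketch below assumes an InfoNCE-style objective over paired latent embeddings, where row i of one embodiment's batch is the positive for row i of the other. The function name `info_nce`, the `temperature` value, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def info_nce(za, zb, temperature=0.1):
    """InfoNCE-style loss for paired embeddings (illustrative assumption).

    za[i] and zb[i] are latent vectors for the same action expressed by two
    different end-effectors; every other row of zb acts as a negative.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    za = [normalize(v) for v in za]
    zb = [normalize(v) for v in zb]

    total = 0.0
    for i, a in enumerate(za):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a, b)) / temperature for b in zb]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denominator - logits[i]
    return total / len(za)


# Toy check: aligned pairs should score a much lower loss than swapped pairs.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

A loss of this shape pulls matching actions from different end-effectors together in the latent space and pushes non-matching ones apart, which is one common way to obtain the kind of semantic alignment the abstract describes.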

mimic-one: a Scalable Model Recipe for General Purpose Robot Dexterity

Jun 13, 2025

Elvis Nava, Victoriano Montesinos, Erik Bauer, Benedek Forrai, Jonas Pai, Stefan Weirich, Stephan-Daniel Gravert, Philipp Wand, Stephan Polinski, Benjamin F. Grewe (+1 more)

Abstract: We present a diffusion-based model recipe for real-world control of a highly dexterous humanoid robotic hand, designed for sample-efficient learning and smooth fine-motor action inference. Our system features a newly designed 16-DoF tendon-driven hand, equipped with wide-angle wrist cameras and mounted on a Franka Emika Panda arm. We develop a versatile teleoperation pipeline and data collection protocol using both glove-based and VR interfaces, enabling high-quality data collection across diverse tasks such as pick and place, item sorting and assembly insertion. Leveraging high-frequency generative control, we train end-to-end policies from raw sensory inputs, enabling smooth, self-correcting motions in complex manipulation scenarios. Real-world evaluations demonstrate up to 93.3% out-of-distribution success rates, with up to a +33.3% performance boost due to emergent self-correcting behaviors, while also revealing scaling trends in policy performance. Our results advance the state-of-the-art in dexterous robotic manipulation through a fully integrated, practical approach to hardware, learning, and real-world deployment.

An Open-Source Soft Robotic Platform for Autonomous Aerial Manipulation in the Wild

Sep 11, 2024

Erik Bauer, Marc Blöchlinger, Pascal Strauch, Arman Raayatsanati, Curdin Cavelti, Robert K. Katzschmann

Abstract: Aerial manipulation combines the versatility and speed of flying platforms with the functional capabilities of mobile manipulation, which presents significant challenges due to the need for precise localization and control. Traditionally, researchers have relied on offboard perception systems, which are limited to expensive and impractical specially equipped indoor environments. In this work, we introduce a novel platform for autonomous aerial manipulation that exclusively utilizes onboard perception systems. Our platform can perform aerial manipulation in various indoor and outdoor environments without depending on external perception systems. Our experimental results demonstrate the platform's ability to autonomously grasp various objects in diverse settings. This advancement significantly improves the scalability and practicality of aerial manipulation applications by eliminating the need for costly tracking solutions. To accelerate future research, we open source our ROS 2 software stack and custom hardware design, making our contributions accessible to the broader research community.

* Project website: https://sites.google.com/view/open-source-soft-platform/open-source-soft-robotic-platform

Autonomous Vision-based Rapid Aerial Grasping

Nov 23, 2022

Erik Bauer, Barnabas Gavin Cangan, Robert K. Katzschmann

Abstract: In a future with autonomous robots, visual and spatial perception is of utmost importance for robotic systems. Particularly for aerial robotics, there are many applications where utilizing visual perception is necessary for any real-world scenario. Robotic aerial grasping using drones promises fast pick-and-place solutions with a large increase in mobility over other robotic solutions. Utilizing Mask R-CNN scene segmentation (detectron2), we propose a vision-based system for autonomous rapid aerial grasping which does not rely on markers for object localization and does not require the size of the object to be previously known. With spatial information from a depth camera, we generate a point cloud of the detected objects and perform geometry-based grasp planning to determine grasping points on the objects. In real-world experiments, we show that our system can localize objects with a mean error of 3 cm compared to a motion capture ground truth for distances from the object ranging from 0.5 m to 2.5 m. Similar grasping efficacy is maintained compared to a system using motion capture for object localization in experiments. With our results, we show the first use of geometry-based grasping techniques with a flying platform and aim to increase the autonomy of existing aerial manipulation platforms, bringing them further towards real-world applications in warehouses and similar environments.

* 7 pages, 10 figures, preprint of submission to IEEE International Conference on Robotics and Automation (ICRA) 2023
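
The abstract above describes building a point cloud of a detected object from a segmentation mask and a depth image. The sketch below shows one standard way this step can be done, assuming a pinhole camera model with known intrinsics. The function names and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are illustrative assumptions; the paper's actual pipeline and its grasp planner are not reproduced here.

```python
def masked_depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into 3D camera-frame points.

    depth: 2D list of depth values in metres (0 marks an invalid reading).
    mask:  2D list of booleans, True where the segmented object is.
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths, principal point).
    """
    points = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, keep) in enumerate(zip(depth_row, mask_row)):
            if keep and z > 0:
                # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def centroid(points):
    """Mean of the points, a simple stand-in for an object position estimate."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Toy 2x2 image: one masked pixel has invalid depth and is skipped.
depth = [[1.0, 1.0], [0.0, 2.0]]
mask = [[True, False], [True, True]]
points = masked_depth_to_points(depth, mask, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
center = centroid(points)
```

The centroid here is only a placeholder for localization. A geometry-based grasp planner would go further and analyze the shape of the resulting point cloud to pick grasping points.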

RAPTOR: Rapid Aerial Pickup and Transport of Objects by Robots

Mar 06, 2022

Aurel Appius, Erik Bauer, Marc Blöchlinger, Aashi Kalra, Robin Oberson, Arman Raayatsanati, Pascal Strauch, Sarath Suresh, Marco von Salis, Robert K. Katzschmann

Abstract: Rapid aerial grasping promises vast applications that utilize the dynamic picking up and placing of objects by robots. Rigid grippers traditionally used in aerial manipulators require very high precision and specific object geometries for successful grasping. We propose RAPTOR, a quadcopter platform combined with a custom Fin Ray gripper to enable a more flexible grasping of objects with different geometries, leveraging the properties of soft materials to increase the contact surface between the gripper and the objects. To reduce the communication latency, we present a novel FastDDS-based middleware solution as an alternative to ROS (Robot Operating System). We show that RAPTOR achieves an average of 83% grasping efficacy in a real-world setting for four different object geometries while moving at an average velocity of 1 m/s during grasping, which is approximately five times faster than the state-of-the-art while supporting up to four times the payload. Our results further solidify the potential of quadcopters in warehouses and other automated pick-and-place applications over longer distances where speed and robustness become essential.

* 6 pages, 8 figures, submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022
