Abstract:Agricultural robotics has emerged as a critical solution to the labor shortages and rising costs associated with manual crop harvesting. Bell pepper harvesting, in particular, is a labor-intensive task, accounting for up to 50% of total production costs. While automated solutions have shown promise in controlled greenhouse environments, harvesting in unstructured outdoor farms remains an open challenge due to environmental variability and occlusion. This paper presents VADER (Vision-guided Autonomous Dual-arm Extraction Robot), a dual-arm mobile manipulation system designed specifically for the autonomous harvesting of bell peppers in outdoor environments. The system integrates a robust perception pipeline coupled with a dual-arm planning framework that coordinates a gripping arm and a cutting arm for extraction. We validate the system through trials in various realistic conditions, demonstrating a harvest success rate exceeding 60% with a cycle time of under 100 seconds per fruit, while also featuring a teleoperation fail-safe based on the GELLO teleoperation framework to ensure robustness. To support robust perception, we contribute a hierarchically structured dataset of over 3,200 images spanning indoor and outdoor domains, pairing wide-field scene images with close-up pepper images to enable a coarse-to-fine training strategy from fruit detection to high-precision pose estimation. The code and dataset will be made publicly available upon acceptance.
Abstract:Dexterous manipulation enables robots to purposefully alter the physical world, transforming them from passive observers into active agents in unstructured environments. This capability is the cornerstone of physical artificial intelligence. Despite decades of advances in hardware, perception, control, and learning, progress toward general manipulation systems remains fragmented due to the absence of widely adopted standard benchmarks. The central challenge lies in reconciling the variability of the real world with the reproducibility and authenticity required for rigorous scientific evaluation. To address this, we introduce ManipulationNet, a global infrastructure that hosts real-world benchmark tasks for robotic manipulation. ManipulationNet delivers reproducible task setups through standardized hardware kits, and enables distributed performance evaluation via a unified software client that delivers real-time task instructions and collects benchmarking results. As a persistent and scalable infrastructure, ManipulationNet organizes benchmark tasks into two complementary tracks: 1) the Physical Skills Track, which evaluates low-level physical interaction skills, and 2) the Embodied Reasoning Track, which tests high-level reasoning and multimodal grounding abilities. This design fosters the systematic growth of an interconnected network of real-world abilities and skills, paving the path toward general robotic manipulation. By enabling comparable manipulation research in the real world at scale, this infrastructure establishes a sustainable foundation for measuring long-term scientific progress and identifying capabilities ready for real-world deployment.




Abstract:We automate soft robotic hand design iteration by co-optimizing design and control policy for dexterous manipulation skills in simulation. Our design iteration pipeline combines genetic algorithms and policy transfer to learn control policies for nearly 400 hand designs, testing grasp quality under external force disturbances. We validate the optimized designs in the real world through teleoperation of pickup and reorient manipulation tasks. Our real world evaluation, from over 900 teleoperated tasks, shows that the trend in design performance in simulation resembles that of the real world. Furthermore, we show that optimized hand designs from our approach outperform existing soft robot hands from prior work in the real world. The results highlight the usefulness of simulation in guiding parameter choices for anthropomorphic soft robotic hand systems, and the effectiveness of our automated design iteration approach, despite the sim-to-real gap.




Abstract:Modeling and simulating soft robot hands can aid in design iteration for complex and high degree-of-freedom (DoF) morphologies. This can be further supplemented by iterating on the design based on its performance in real world manipulation tasks. However, this requires a framework that allows us to iterate quickly at low costs. In this paper, we present a framework that leverages rapid prototyping of the hand using 3D-printing, and utilizes teleoperation to evaluate the hand in real world manipulation tasks. Using this framework, we design a 3D-printed 16-DoF dexterous anthropomorphic soft hand (DASH) and iteratively improve its design over three iterations. Rapid prototyping techniques such as 3D-printing allow us to directly evaluate the fabricated hand without modeling it in simulation. We show that the design is improved at each iteration through the hand's performance in 30 real-world teleoperated manipulation tasks. Testing over 600 demonstrations shows that our final version of DASH can solve 16 of the 30 tasks compared to Allegro, a popular rigid hand in the market, which can only solve 7 tasks. We open-source our CAD models as well as the teleoperated dataset for further study and are available on our website (https://dash-through-interaction.github.io.)