Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Siddhartha Kapuria

Brian

Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics

Apr 22, 2026

Open-H-Embodiment Consortium, :, Nigel Nelson, Juo-Tung Chen, Jesse Haworth, Xinhao Chen, Lukas Zbinden, Dianye Huang, Alaa Eldin Abdelaal, Alberto Arezzo(+206 more)

Abstract:Autonomous medical robots hold promise to improve patient outcomes, reduce provider workload, democratize access to care, and enable superhuman precision. However, autonomous medical robotics has been limited by a fundamental data problem: existing medical robotic datasets are small, single-embodiment, and rarely shared openly, restricting the development of foundation models that the field needs to advance. We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 49 institutions and multiple robotic platforms including the CMR Versius, Intuitive Surgical's da Vinci, da Vinci Research Kit (dVRK), Rob Surgical BiTrack, Virtual Incision's MIRA, Moon Surgical Maestro, and a variety of custom systems, spanning surgical manipulation, robotic ultrasound, and endoscopy procedures. We demonstrate the research enabled by this dataset through two foundation models. GR00T-H is the first open foundation vision-language-action model for medical robotics, which is the only evaluated model to achieve full end-to-end task completion on a structured suturing benchmark (25% of trials vs. 0% for all others) and achieves 64% average success across a 29-step ex vivo suturing sequence. We also train Cosmos-H-Surgical-Simulator, the first action-conditioned world model to enable multi-embodiment surgical simulation from a single checkpoint, spanning nine robotic platforms and supporting in silico policy evaluation and synthetic data generation for the medical domain. These results suggest that open, large-scale medical robot data collection can serve as critical infrastructure for the research community, enabling advances in robot learning, world modeling, and beyond.

* Project website: https://open-h.github.io/open-h-embodiment/

Via

Access Paper or Ask Questions

OpenRC: An Open-Source Robotic Colonoscopy Framework for Multimodal Data Acquisition and Autonomy Research

Apr 04, 2026

Siddhartha Kapuria, Mohammad Rafiee Javazm, Naruhiko Ikoma, Joga Ivatury, Mohammad Ali Nasseri, Nassir Navab, Farshid Alambeigi

Abstract:Colorectal cancer screening critically depends on colonoscopy, yet existing platforms offer limited support for systematically studying the coupled dynamics of operator control, instrument motion, and visual feedback. This gap restricts reproducible closed-loop research in robotic colonoscopy, medical imaging, and emerging vision-language-action (VLA) learning paradigms. To address this challenge, we present OpenRC, an open-source modular robotic colonoscopy framework that retrofits conventional scopes while preserving clinical workflow. The framework supports simultaneous recording of video, operator commands, actuation state, and distal tip pose. We experimentally validated motion consistency and quantified cross-modal latency across sensing streams. Using this platform, we collected a multimodal dataset comprising 1,894 teleoperated episodes ~19 hours across 10 structured task variations of routine navigation, failure events, and recovery behaviors. By unifying open hardware and an aligned multimodal dataset, OpenRC provides a reproducible foundation for research in multimodal robotic colonoscopy and surgical autonomy.

Via

Access Paper or Ask Questions

Augmented Bridge Spinal Fixation: A New Concept for Addressing Pedicle Screw Pullout via a Steerable Drilling Robot and Flexible Pedicle Screws

Jul 02, 2025

Yash Kulkarni, Susheela Sharma, Omid Rezayof, Siddhartha Kapuria, Jordan P. Amadio, Mohsen Khadem, Maryam Tilton, Farshid Alambeigi

Abstract:To address the screw loosening and pullout limitations of rigid pedicle screws in spinal fixation procedures, and to leverage our recently developed Concentric Tube Steerable Drilling Robot (CT-SDR) and Flexible Pedicle Screw (FPS), in this paper, we introduce the concept of Augmented Bridge Spinal Fixation (AB-SF). In this concept, two connecting J-shape tunnels are first drilled through pedicles of vertebra using the CT-SDR. Next, two FPSs are passed through this tunnel and bone cement is then injected through the cannulated region of the FPS to form an augmented bridge between two pedicles and reinforce strength of the fixated spine. To experimentally analyze and study the feasibility of AB-SF technique, we first used our robotic system (i.e., a CT-SDR integrated with a robotic arm) to create two different fixation scenarios in which two J-shape tunnels, forming a bridge, were drilled at different depth of a vertebral phantom. Next, we implanted two FPSs within the drilled tunnels and then successfully simulated the bone cement augmentation process.

Via

Access Paper or Ask Questions

Robot-Enabled Machine Learning-Based Diagnosis of Gastric Cancer Polyps Using Partial Surface Tactile Imaging

Aug 02, 2024

Siddhartha Kapuria, Jeff Bonyun, Yash Kulkarni, Naruhiko Ikoma, Sandeep Chinchali, Farshid Alambeigi

Figure 1 for Robot-Enabled Machine Learning-Based Diagnosis of Gastric Cancer Polyps Using Partial Surface Tactile Imaging

Figure 2 for Robot-Enabled Machine Learning-Based Diagnosis of Gastric Cancer Polyps Using Partial Surface Tactile Imaging

Figure 3 for Robot-Enabled Machine Learning-Based Diagnosis of Gastric Cancer Polyps Using Partial Surface Tactile Imaging

Figure 4 for Robot-Enabled Machine Learning-Based Diagnosis of Gastric Cancer Polyps Using Partial Surface Tactile Imaging

Abstract:In this paper, to collectively address the existing limitations on endoscopic diagnosis of Advanced Gastric Cancer (AGC) Tumors, for the first time, we propose (i) utilization and evaluation of our recently developed Vision-based Tactile Sensor (VTS), and (ii) a complementary Machine Learning (ML) algorithm for classifying tumors using their textural features. Leveraging a seven DoF robotic manipulator and unique custom-designed and additively-manufactured realistic AGC tumor phantoms, we demonstrated the advantages of automated data collection using the VTS addressing the problem of data scarcity and biases encountered in traditional ML-based approaches. Our synthetic-data-trained ML model was successfully evaluated and compared with traditional ML models utilizing various statistical metrics even under mixed morphological characteristics and partial sensor contact.

Via

Access Paper or Ask Questions

A Patient-Specific Framework for Autonomous Spinal Fixation via a Steerable Drilling Robot

May 27, 2024

Susheela Sharma, Sarah Go, Zeynep Yakay, Yash Kulkarni, Siddhartha Kapuria, Jordan P. Amadio, Reza Rajebi, Mohsen Khadem, Nassir Navab, Farshid Alambeigi

Figure 1 for A Patient-Specific Framework for Autonomous Spinal Fixation via a Steerable Drilling Robot

Figure 2 for A Patient-Specific Framework for Autonomous Spinal Fixation via a Steerable Drilling Robot

Figure 3 for A Patient-Specific Framework for Autonomous Spinal Fixation via a Steerable Drilling Robot

Figure 4 for A Patient-Specific Framework for Autonomous Spinal Fixation via a Steerable Drilling Robot

Abstract:In this paper, with the goal of enhancing the minimally invasive spinal fixation procedure in osteoporotic patients, we propose a first-of-its-kind image-guided robotic framework for performing an autonomous and patient-specific procedure using a unique concentric tube steerable drilling robot (CT-SDR). Particularly, leveraging a CT-SDR, we introduce the concept of J-shape drilling based on a pre-operative trajectory planned in CT scan of a patient followed by appropriate calibration, registration, and navigation steps to safely execute this trajectory in real-time using our unique robotic setup. To thoroughly evaluate the performance of our framework, we performed several experiments on two different vertebral phantoms designed based on CT scan of real patients.

* 10 pages, 3 figures

Via

Access Paper or Ask Questions

Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

Apr 25, 2023

Siddhartha Kapuria, Tarunraj G. Mohanraj, Nethra Venkatayogi, Ozdemir Can Kara, Yuki Hirata, Patrick Minot, Ariel Kapusta, Naruhiko Ikoma, Farshid Alambeigi

Figure 1 for Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

Figure 2 for Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

Figure 3 for Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

Figure 4 for Towards Reliable Colorectal Cancer Polyps Classification via Vision Based Tactile Sensing and Confidence-Calibrated Neural Networks

Abstract:In this study, toward addressing the over-confident outputs of existing artificial intelligence-based colorectal cancer (CRC) polyp classification techniques, we propose a confidence-calibrated residual neural network. Utilizing a novel vision-based tactile sensing (VS-TS) system and unique CRC polyp phantoms, we demonstrate that traditional metrics such as accuracy and precision are not sufficient to encapsulate model performance for handling a sensitive CRC polyp diagnosis. To this end, we develop a residual neural network classifier and address its over-confident outputs for CRC polyps classification via the post-processing method of temperature scaling. To evaluate the proposed method, we introduce noise and blur to the obtained textural images of the VS-TS and test the model's reliability for non-ideal inputs through reliability diagrams and other statistical metrics.

Via

Access Paper or Ask Questions