Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Abhinav Gandhi

Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators

Apr 14, 2026

Sreejani Chatterjee, Venkatesh Mullur, Abhinav Gandhi, Berk Calli

Abstract:In this paper we present a novel visual servoing framework to control a robotic manipulator in the configuration space by using purely natural visual features. Our goal is to develop methods that can robustly detect and track natural features or keypoints on robotic manipulators that would be used for vision-based control, especially for scenarios where placing external markers on the robot is not feasible or preferred at runtime. For the model training process of our data driven approach, we create a data collection pipeline where we attach ArUco markers along the robot's body, label their centers as keypoints, and then utilize an inpainting method to remove the markers and reconstruct the occluded regions. By doing so, we generate natural (markerless) robot images that are automatically labeled with the marker locations. These images are used to train a keypoint detection algorithm, which is used to control the robot configuration using natural features of the robot. Unlike the prior methods that rely on accurate camera calibration and robot models for labeling training images, our approach eliminates these dependencies through inpainting. To achieve robust keypoint detection even in the presence of occlusion, we introduce a second inpainting model, this time to utilize during runtime, that reconstructs occluded regions of the robot in real time, enabling continuous keypoint detection. To further enhance the consistency and robustness of keypoint predictions, we integrate an Unscented Kalman Filter (UKF) that refines the keypoint estimates over time, adding to stable and reliable control performance. We obtained successful control results with this model-free and purely vision-based control strategy, utilizing natural robot features in the runtime, both under full visibility and partial occlusion.

Via

Access Paper or Ask Questions

3D Shape Control of Extensible Multi-Section Soft Continuum Robots via Visual Servoing

Feb 22, 2026

Abhinav Gandhi, Shou-Shan Chiang, Cagdas D. Onal, Berk Calli

Abstract:In this paper, we propose a novel vision-based control algorithm for regulating the whole body shape of extensible multisection soft continuum manipulators. Contrary to existing vision-based control algorithms in the literature that regulate the robot's end effector pose, our proposed control algorithm regulates the robot's whole body configuration, enabling us to leverage its kinematic redundancy. Additionally, our model-based 2.5D shape visual servoing provides globally stable asymptotic convergence in the robot's 3D workspace compared to the closest works in the literature that report local minima. Unlike existing visual servoing algorithms in the literature, our approach does not require information from proprioceptive sensors, making it suitable for continuum manipulators without such capabilities. Instead, robot state is estimated from images acquired by an external camera that observes the robot's whole body shape and is also utilized to close the shape control loop. Traditionally, visual servoing schemes require an image of the robot at its reference pose to generate the reference features. In this work, we utilize an inverse kinematics solver to generate reference features for the desired robot configuration and do not require images of the robot at the reference. Experiments are performed on a multisection continuum manipulator demonstrating the controller's capability to regulate the robot's whole body shape while precisely positioning the robot's end effector. Results validate our controller's ability to regulate the shape of continuum robots while demonstrating a smooth transient response and a steady-state error within 1 mm. Proof-of-concept object manipulation experiments including stacking, pouring, and pulling tasks are performed to demonstrate our controller's applicability.

Via

Access Paper or Ask Questions

Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators

Feb 26, 2025

Sreejani Chatterjee, Abhinav Gandhi, Berk Calli, Constantinos Chamzas

Figure 1 for Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators

Figure 2 for Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators

Figure 3 for Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators

Figure 4 for Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators

Abstract:This work presents a motion planning framework for robotic manipulators that computes collision-free paths directly in image space. The generated paths can then be tracked using vision-based control, eliminating the need for an explicit robot model or proprioceptive sensing. At the core of our approach is the construction of a roadmap entirely in image space. To achieve this, we explicitly define sampling, nearest-neighbor selection, and collision checking based on visual features rather than geometric models. We first collect a set of image-space samples by moving the robot within its workspace, capturing keypoints along its body at different configurations. These samples serve as nodes in the roadmap, which we construct using either learned or predefined distance metrics. At runtime, the roadmap generates collision-free paths directly in image space, removing the need for a robot model or joint encoders. We validate our approach through an experimental study in which a robotic arm follows planned paths using an adaptive vision-based control scheme to avoid obstacles. The results show that paths generated with the learned-distance roadmap achieved 100% success in control convergence, whereas the predefined image-space distance roadmap enabled faster transient responses but had a lower success rate in convergence.

Via

Access Paper or Ask Questions