Juan Wachs

Deep Kernel and Image Quality Estimators for Optimizing Robotic Ultrasound Controller using Bayesian Optimization

Oct 11, 2023
Deepak Raina, SH Chandrashekhara, Richard Voyles, Juan Wachs, Subir Kumar Saha

Ultrasound is a commonly used medical imaging modality that requires expert sonographers to manually maneuver the ultrasound probe based on the acquired image. Autonomous Robotic Ultrasound (A-RUS) is an appealing alternative to this manual procedure, as it can reduce sonographers' workload. The key challenge in A-RUS is optimizing the ultrasound image quality for the region of interest across different patients, which requires knowledge of anatomy, recognition of error sources, and precise probe position, orientation, and pressure. Sample efficiency is important when optimizing these parameters of the robotized probe controller. Bayesian Optimization (BO), a sample-efficient optimization framework, has recently been applied to optimize the 2D motion of the probe; nevertheless, further improvements are needed in sample efficiency for high-dimensional control of the probe. We address this problem by using a neural network to learn a low-dimensional kernel in BO, termed a Deep Kernel (DK). The neural network of the DK is trained using probe and image data acquired during the procedure. Two image quality estimators based on a deep convolutional neural network are proposed to provide real-time feedback to BO. We validated our framework using these two feedback functions on three urinary bladder phantoms and obtained an over 50% increase in sample efficiency for 6D control of the robotized probe. Furthermore, our results indicate that this performance enhancement in BO is independent of the specific training dataset, demonstrating inter-patient adaptability.
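
As a rough illustration of the deep kernel idea, the sketch below (not the authors' code) uses a small MLP to map the 6D probe pose into a low-dimensional feature space and evaluates an RBF kernel on those features; the layer sizes and two-dimensional feature space are assumptions.

```python
# Minimal sketch of a deep kernel for a GP surrogate in BO: an MLP maps the
# 6D probe pose to a low-dimensional feature space, and an RBF kernel is
# evaluated on those features. Sizes and names are illustrative only.
import torch
import torch.nn as nn


class DeepKernel(nn.Module):
    def __init__(self, in_dim: int = 6, feat_dim: int = 2, lengthscale: float = 1.0):
        super().__init__()
        # Feature extractor trained on probe poses and image-quality data.
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, feat_dim),
        )
        self.log_lengthscale = nn.Parameter(torch.log(torch.tensor(lengthscale)))

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        # RBF kernel on the learned low-dimensional features.
        z1, z2 = self.net(x1), self.net(x2)
        sq_dist = torch.cdist(z1, z2).pow(2)
        return torch.exp(-0.5 * sq_dist / torch.exp(self.log_lengthscale) ** 2)


if __name__ == "__main__":
    k = DeepKernel()
    poses = torch.rand(10, 6)   # hypothetical 6D probe poses
    K = k(poses, poses)         # 10x10 covariance matrix for the GP in BO
    print(K.shape)
```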

* Accepted in IEEE International Symposium on Medical Robotics (ISMR) 2023 

RUSOpt: Robotic UltraSound Probe Normalization with Bayesian Optimization for In-plane and Out-plane Scanning

Oct 05, 2023
Deepak Raina, Abhishek Mathur, Richard M. Voyles, Juan Wachs, SH Chandrashekhara, Subir Kumar Saha

One of the significant challenges faced by autonomous robotic ultrasound systems is acquiring high-quality images across different patients. The orientation of the robotized probe plays a crucial role in governing image quality. To address this challenge, we propose a sample-efficient method to automatically adjust the orientation of the ultrasound probe so that it is normal to the point of contact on the scanning surface, thereby improving the acoustic coupling of the probe and the resulting image quality. Our method uses a Bayesian Optimization (BO) based search on the scanning surface to efficiently find the normal probe orientation. We formulate a novel objective function for BO that leverages contact force measurements and the underlying mechanics to identify the normal, and we incorporate a regularization scheme in BO to handle the noisy objective function. The performance of the proposed strategy was assessed through experiments on urinary bladder phantoms with planar, tilted, and rough surfaces, examined using both linear and convex probes with varying search-space limits. In addition, simulation-based studies were carried out using 3D human mesh models. The results show that the mean ($\pm$SD) absolute angular error averaged over all phantoms and 3D models is $\boldsymbol{2.4\pm0.7^\circ}$ and $\boldsymbol{2.1\pm1.3^\circ}$, respectively.
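
For intuition, here is a minimal sketch of a BO-based orientation search of the kind described above, using scikit-optimize's gp_minimize; read_axial_force is a hypothetical stand-in for the robot's force/torque interface, and the paper's actual objective additionally encodes contact mechanics and a regularization scheme for noise.

```python
# Illustrative sketch (not the paper's implementation) of searching for the
# probe orientation normal to the surface with GP-based Bayesian optimization.
import numpy as np
from skopt import gp_minimize


def read_axial_force(rx_deg: float, ry_deg: float) -> float:
    # Placeholder sensor model: pretend the axial force peaks when the probe
    # is normal to a surface tilted by (5, -3) degrees.
    return float(np.exp(-((rx_deg - 5.0) ** 2 + (ry_deg + 3.0) ** 2) / 50.0))


def objective(angles):
    rx, ry = angles
    # Maximize the measured axial force -> minimize its negative.
    return -read_axial_force(rx, ry)


result = gp_minimize(
    objective,
    dimensions=[(-15.0, 15.0), (-15.0, 15.0)],  # search-space limits in degrees
    n_calls=25,
    random_state=0,
)
print("Estimated normal orientation (deg):", result.x)
```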

* Accepted in IEEE International Conference on Automation Science and Engineering (CASE) 2023 

Robotic Sonographer: Autonomous Robotic Ultrasound using Domain Expertise in Bayesian Optimization

Jul 05, 2023
Deepak Raina, SH Chandrashekhara, Richard Voyles, Juan Wachs, Subir Kumar Saha

Ultrasound is a vital imaging modality utilized for a variety of diagnostic and interventional procedures. However, an expert sonographer is required to make accurate maneuvers of the probe over the human body while interpreting the ultrasound images for diagnostic purposes, a procedure that requires substantial training and up to a few years of experience. In this paper, we propose an autonomous robotic ultrasound system that uses Bayesian Optimization (BO) in combination with domain expertise to predict and effectively scan the regions where diagnostic-quality ultrasound images can be acquired. The quality map, a distribution of image quality over the scanning region, is estimated using a Gaussian process in BO. This relies on a prior quality map modeled from an expert's demonstrations of high-quality probing maneuvers. Ultrasound image quality feedback is provided to BO by a deep convolutional neural network model that was previously trained on a database of images labelled for diagnostic quality by expert radiologists. Experiments on three different urinary bladder phantoms validated that the proposed autonomous ultrasound system can acquire ultrasound images for diagnostic purposes with a probing position and force accuracy of 98.7% and 97.8%, respectively.
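
A minimal sketch, under the assumption that demonstrated probe positions mark high-quality regions, of how a prior quality map over the scanning region could be formed and later used as the GP mean in BO; the positions, bandwidth, and grid below are illustrative, not the paper's values.

```python
# Prior quality map built from expert demonstrations (hypothetical values),
# intended to serve as the mean function of the GP over the scanning region.
import numpy as np

demo_points = np.array([[0.02, 0.01], [0.03, 0.00]])  # demonstrated 2D probe positions (m)


def prior_quality(x: np.ndarray, bandwidth: float = 0.01) -> np.ndarray:
    """RBF bump around each demonstrated position, averaged into a prior map."""
    d2 = ((x[:, None, :] - demo_points[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / bandwidth**2).mean(axis=1)


# Evaluate the prior over a grid of candidate probe positions.
grid = np.stack(
    np.meshgrid(np.linspace(0, 0.05, 20), np.linspace(-0.02, 0.02, 20)), -1
).reshape(-1, 2)
prior = prior_quality(grid)   # the GP posterior in BO would be conditioned on this mean
print(prior.shape)            # (400,)
```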

* Accepted in IEEE International Conference on Robotics and Automation (ICRA) 2023 

Human-centered XAI for Burn Depth Characterization

Oct 24, 2022
Maxwell J. Jacobson, Daniela Chanci Arrubla, Maria Romeo Tricas, Gayle Gordillo, Yexiang Xue, Chandan Sen, Juan Wachs

Approximately 1.25 million people in the United States are treated each year for burn injuries. Precise burn injury classification is an important aspect of the medical AI field. In this work, we propose an explainable human-in-the-loop framework for improving burn ultrasound classification models. Our framework leverages an explanation system based on the LIME classification explainer to corroborate and integrate a burn expert's knowledge -- suggesting new features and ensuring the validity of the model. Using this framework, we discover that B-mode ultrasound classifiers can be enhanced by supplying textural features. More specifically, we confirm that texture features based on the Gray Level Co-occurrence Matrix (GLCM) of ultrasound frames can increase the accuracy of transfer-learned burn depth classifiers. We test our hypothesis on real data from porcine subjects and show improvements in the accuracy of burn depth classification, from ~88% to ~94%, once the model is modified according to our framework.
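
The sketch below shows the kind of GLCM texture features referred to above, computed with scikit-image; the specific distances, angles, and properties used in the paper may differ.

```python
# GLCM texture statistics for an 8-bit B-mode ultrasound frame.
# Requires scikit-image >= 0.19 for the "gray*" spelling of these functions.
import numpy as np
from skimage.feature import graycomatrix, graycoprops


def glcm_features(frame: np.ndarray) -> np.ndarray:
    """Compute GLCM texture statistics over several distances and angles."""
    glcm = graycomatrix(
        frame,
        distances=[1, 2],
        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
        levels=256,
        symmetric=True,
        normed=True,
    )
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])


# Example with a random frame standing in for an ultrasound image.
frame = np.random.randint(0, 256, size=(128, 128), dtype=np.uint8)
features = glcm_features(frame)   # could be concatenated with CNN features downstream
print(features.shape)             # (32,) = 4 properties x 2 distances x 4 angles
```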

Active Multi-Object Exploration and Recognition via Tactile Whiskers

Sep 08, 2021
Chenxi Xiao, Shujia Xu, Wenzhuo Wu, Juan Wachs

Robotic exploration in uncertain environments is challenging when optical information is not available. In this paper, we propose an autonomous solution for exploring an unknown task space based on tactile sensing alone. We first designed a whisker sensor based on MEMS barometer devices; this sensor acquires contact information by interacting with the environment non-intrusively. The sensor is accompanied by a planning technique that generates exploration trajectories using tactile perception alone. This technique relies on a hybrid policy for tactile exploration, which includes a proactive informative path planner for object searching and a reactive Hopf oscillator for contour tracing. Results indicate that the hybrid exploration policy increases the efficiency of object discovery. Finally, scene understanding was facilitated by object segmentation and classification: a classifier was developed to recognize object categories based on the geometric features collected by the whisker sensor. This approach demonstrates that the whisker sensor, together with tactile intelligence, can provide sufficiently discriminative features to distinguish objects.
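
As a minimal illustration of the reactive contour-tracing component, below is a plain Hopf oscillator integrated with Euler steps; the coupling to whisker contact feedback in the actual policy is more involved, and the parameters are only illustrative.

```python
# Hopf oscillator limit cycle: the state converges to a circle of radius
# sqrt(mu) rotating at angular frequency omega, which can drive a periodic
# tracing motion around a contacted contour.
import numpy as np


def hopf_step(x: float, y: float, mu: float = 1.0, omega: float = 2.0 * np.pi,
              dt: float = 0.01) -> tuple[float, float]:
    """One Euler step of the Hopf oscillator dynamics."""
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y
    dy = (mu - r2) * y + omega * x
    return x + dx * dt, y + dy * dt


x, y = 0.1, 0.0
trajectory = []
for _ in range(1000):
    x, y = hopf_step(x, y)
    trajectory.append((x, y))   # could modulate tangential/normal probe motion
print(trajectory[-1])
```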

Learning Multimodal Contact-Rich Skills from Demonstrations Without Reward Engineering

Mar 01, 2021
Mythra V. Balakuntala, Upinder Kaur, Xin Ma, Juan Wachs, Richard M. Voyles

Everyday contact-rich tasks, such as peeling, cleaning, and writing, demand multimodal perception for effective and precise task execution. However, these tasks present a novel challenge to robots, which lack the ability to combine multimodal stimuli when performing contact-rich tasks. Learning-based methods have attempted to model multimodal contact-rich tasks, but they often require extensive training examples and task-specific reward functions, which limits their practicality and scope. Hence, we propose a generalizable, model-free learning-from-demonstration framework for robots to learn contact-rich skills without explicit reward engineering. We present a novel multimodal sensor data representation that improves the learning performance for contact-rich skills. We performed training and experiments using a real Sawyer robot for three everyday contact-rich skills: cleaning, writing, and peeling. Notably, the framework achieves a success rate of 100% for the peeling and writing skills, and 80% for the cleaning skill. Hence, this skill learning framework can be extended to learning other physical manipulation skills.
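
Purely as an illustration of fusing multimodal observations into a single per-timestep feature (the paper's actual representation is not detailed here and will differ), a hypothetical sketch:

```python
# Toy fusion of vision, force/torque, and proprioception into one feature
# vector per timestep; all names, sizes, and the normalization scheme are
# assumptions for illustration only.
import numpy as np


def fuse_observation(image_feat: np.ndarray, wrench: np.ndarray, joints: np.ndarray) -> np.ndarray:
    """Standardize each modality and concatenate into a single feature vector."""
    def z(v):  # per-modality standardization so no channel dominates
        return (v - v.mean()) / (v.std() + 1e-8)
    return np.concatenate([z(image_feat), z(wrench), z(joints)])


obs = fuse_observation(
    image_feat=np.random.rand(64),   # hypothetical CNN embedding of the camera frame
    wrench=np.random.rand(6),        # Fx, Fy, Fz, Tx, Ty, Tz from the wrist sensor
    joints=np.random.rand(7),        # Sawyer's 7 joint angles
)
print(obs.shape)                     # (77,)
```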

* Submitted to IEEE ICRA 2021. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible 

Pose-based Sign Language Recognition using GCN and BERT

Dec 01, 2020
Anirudh Tunga, Sai Vidyaranya Nuthalapati, Juan Wachs

Sign language recognition (SLR) plays a crucial role in bridging the communication gap between the hearing and vocally impaired community and the rest of society. Word-level sign language recognition (WSLR) is the first important step towards understanding and interpreting sign language. However, recognizing signs from videos is a challenging task, as the meaning of a word depends on a combination of subtle body motions, hand configurations, and other movements. Recent pose-based architectures for WSLR either model the spatial and temporal dependencies among the poses in different frames simultaneously, or model only the temporal information without fully utilizing the spatial information. We tackle WSLR using a novel pose-based approach that captures spatial and temporal information separately and performs late fusion. Our proposed architecture explicitly captures the spatial interactions in the video using a Graph Convolutional Network (GCN), while the temporal dependencies between frames are captured using Bidirectional Encoder Representations from Transformers (BERT). Experimental results on WLASL, a standard word-level sign language recognition dataset, show that our model significantly outperforms state-of-the-art pose-based methods, improving prediction accuracy by up to 5%.
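
A heavily simplified sketch of the two-stream idea, a spatial GCN and a temporal transformer encoder with late fusion of class logits; the joint graph, layer sizes, and the use of a pretrained BERT model are all placeholders rather than the paper's architecture.

```python
# Two-stream pose model: spatial graph convolution over joints per frame,
# temporal transformer over frames, and late fusion of the class logits.
import torch
import torch.nn as nn


class SimpleGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.register_buffer("adj", adj)   # normalized joint adjacency matrix

    def forward(self, x):                  # x: (batch, frames, joints, feat)
        return torch.relu(self.lin(torch.einsum("ij,bfjd->bfid", self.adj, x)))


class TwoStreamSLR(nn.Module):
    def __init__(self, n_joints=27, n_classes=100, d_model=64):
        super().__init__()
        adj = torch.eye(n_joints)          # placeholder graph; real skeleton edges differ
        self.gcn = SimpleGCNLayer(2, d_model, adj)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.head_spatial = nn.Linear(d_model, n_classes)
        self.head_temporal = nn.Linear(d_model, n_classes)

    def forward(self, poses):              # poses: (batch, frames, joints, 2)
        feat = self.gcn(poses)                              # (batch, frames, joints, d_model)
        spatial = feat.mean(dim=(1, 2))                     # pooled spatial features
        temporal = self.temporal(feat.mean(dim=2))          # per-frame tokens over time
        # Late fusion: average the two streams' class logits.
        return 0.5 * (self.head_spatial(spatial) + self.head_temporal(temporal.mean(dim=1)))


model = TwoStreamSLR()
logits = model(torch.rand(2, 16, 27, 2))   # 2 clips, 16 frames, 27 joints (x, y)
print(logits.shape)                        # (2, 100)
```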

From the DESK (Dexterous Surgical Skill) to the Battlefield -- A Robotics Exploratory Study

Nov 30, 2020
Glebys T. Gonzalez, Upinder Kaur, Masudur Rahman, Vishnunandan Venkatesh, Natalia Sanchez, Gregory Hager, Yexiang Xue, Richard Voyles, Juan Wachs

Short response time is critical for future military medical operations in austere settings or remote areas. Effective patient care at the point of injury can greatly benefit from the integration of semi-autonomous robotic systems. To achieve autonomy, robots would require massive libraries of maneuvers; while this is possible in controlled settings, obtaining surgical data in austere settings can be difficult. Hence, in this paper, we present the Dexterous Surgical Skill (DESK) database for knowledge transfer between robots. The peg transfer task was selected as it is one of the six main tasks of laparoscopic training. We also provide a machine learning framework to evaluate novel transfer learning methodologies on this database. The collected DESK dataset comprises a set of surgical robotic skills recorded on four robotic platforms: Taurus II, simulated Taurus II, YuMi, and the da Vinci Research Kit. We then explored two different learning scenarios: no-transfer and domain-transfer. In the no-transfer scenario, the training and testing data were obtained from the same domain, whereas in the domain-transfer scenario the training data is a blend of simulated and real robot data that is tested on a real robot. Using simulation data enhances the performance on the real robot where limited or no real data is available. The transfer model showed an accuracy of 81% for the YuMi robot when the ratio of real-to-simulated data was 22%-78%. For the Taurus II and da Vinci robots, the model showed accuracies of 97.5% and 93%, respectively, when trained only with simulation data. These results indicate that simulation can be used to augment training data and enhance the performance of models in real scenarios, showing the potential for future use of surgical data from the operating room in deployable surgical robots for remote areas.
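
To make the domain-transfer setup concrete, here is a toy sketch of blending simulated and real samples at a chosen real-to-simulated ratio before testing on real data only; the arrays, sizes, and downstream classifier are placeholders, not the DESK pipeline.

```python
# Build a training set with a fixed real:simulated ratio (e.g. 22%:78%),
# evaluation would then be performed on held-out real-robot data.
import numpy as np


def blend_training_set(sim_X, sim_y, real_X, real_y, real_fraction=0.22, seed=0):
    """Mix real and simulated samples so that real_fraction of the set is real."""
    rng = np.random.default_rng(seed)
    n_total = len(sim_X) + len(real_X)
    n_real = int(round(real_fraction * n_total))
    n_sim = n_total - n_real
    real_idx = rng.choice(len(real_X), size=min(n_real, len(real_X)), replace=False)
    sim_idx = rng.choice(len(sim_X), size=min(n_sim, len(sim_X)), replace=False)
    X = np.concatenate([real_X[real_idx], sim_X[sim_idx]])
    y = np.concatenate([real_y[real_idx], sim_y[sim_idx]])
    return X, y


# Hypothetical feature arrays standing in for the recorded skill data.
sim_X, sim_y = np.random.rand(780, 50), np.random.randint(0, 7, 780)
real_X, real_y = np.random.rand(220, 50), np.random.randint(0, 7, 220)
train_X, train_y = blend_training_set(sim_X, sim_y, real_X, real_y)
print(train_X.shape)
```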

* Published in MHSRS 2020  
* First 3 authors share equal contribution 

Pose Imitation Constraints for Collaborative Robots

Oct 12, 2020
Glebys Gonzalez, Juan Wachs

Achieving human-like motion in robots has been a fundamental goal in many areas of robotics research. Inverse kinematics (IK) solvers have been explored as a solution for providing kinematic structures with anthropomorphic movements. In particular, numeric solvers based on geometry, such as FABRIK, have shown potential for producing human-like motion at a low computational cost. Nevertheless, these methods have shown limitations when solving for robot kinematic constraints. This work proposes a framework inspired by FABRIK for human pose imitation in real time. The goal is to mitigate the problems of the original algorithm while retaining the resulting human-like fluidity and low cost. We first propose a human constraint model for pose imitation. Then, we present a pose imitation algorithm (PIC) and its soft version (PICs) that can successfully imitate human poses using the proposed constraint system. PIC was tested on two collaborative robots (Baxter and YuMi). Fifty human demonstrations were collected for a bi-manual assembly task and an incision task. Two performance metrics were then obtained for both robots: pose accuracy with respect to the human and the percentage of environment occlusion/obstruction. The performance of PIC and PICs was compared against the numerical solver baseline (FABRIK). The proposed algorithms achieve a higher pose accuracy than FABRIK for both tasks (25%-FABRIK, 53%-PICs, 58%-PICs). In addition, PIC and its soft version achieve a lower percentage of occlusion during incision (10%-FABRIK, 4%-PICs, 9%-PICs). These results indicate that the PIC method can reproduce human poses and achieve key desired effects of human imitation.
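
For context on the baseline that PIC extends, a compact sketch of the standard FABRIK backward/forward reaching pass is given below; the human-constraint model and the PIC/PICs pose-imitation logic themselves are not shown.

```python
# Plain FABRIK for a serial chain of joint positions (no joint constraints):
# alternate a backward pass that pins the end effector to the target and a
# forward pass that re-pins the base, until the end effector converges.
import numpy as np


def fabrik(joints, target, tol=1e-3, max_iter=50):
    joints = np.asarray(joints, dtype=float).copy()
    lengths = np.linalg.norm(np.diff(joints, axis=0), axis=1)
    base = joints[0].copy()
    for _ in range(max_iter):
        if np.linalg.norm(joints[-1] - target) < tol:
            break
        # Backward pass: pin the end effector to the target.
        joints[-1] = target
        for i in range(len(joints) - 2, -1, -1):
            lam = lengths[i] / np.linalg.norm(joints[i + 1] - joints[i])
            joints[i] = (1 - lam) * joints[i + 1] + lam * joints[i]
        # Forward pass: re-pin the base and propagate outward.
        joints[0] = base
        for i in range(len(joints) - 1):
            lam = lengths[i] / np.linalg.norm(joints[i + 1] - joints[i])
            joints[i + 1] = (1 - lam) * joints[i] + lam * joints[i + 1]
    return joints


chain = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.6, 0.0, 0.0]])  # toy 2-link arm
print(fabrik(chain, target=np.array([0.3, 0.3, 0.0]))[-1])
```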

* 9 pages, 8 figures, 3 tables 

PICs for TECH: Pose Imitation Constraints (PICs) for TEaching Collaborative Heterogeneous robots (TECH)

Sep 23, 2020
Glebys Gonzalez, Juan Wachs

Achieving human-like motion in robots has been a fundamental goal in many areas of robotics research. Inverse kinematics (IK) solvers have been explored as a solution for providing kinematic structures with anthropomorphic movements. In particular, numeric solvers based on geometry, such as FABRIK, have shown potential for producing human-like motion at a low computational cost. Nevertheless, these methods have shown limitations when solving for robot kinematic constraints. This work proposes a framework inspired by FABRIK for human pose imitation in real time. The goal is to mitigate the problems of the original algorithm while retaining the resulting human-like fluidity and low cost. We first propose a human constraint model for pose imitation. Then, we present a pose imitation algorithm (PIC) and its soft version (PICs) that can successfully imitate human poses using the proposed constraint system. PIC was tested on two collaborative robots (Baxter and YuMi). Fifty human demonstrations were collected for a bi-manual assembly task and an incision task. Two performance metrics were then obtained for both robots: pose accuracy with respect to the human and the percentage of environment occlusion/obstruction. The performance of PIC and PICs was compared against the numerical solver baseline (FABRIK). The proposed algorithms achieve a higher pose accuracy than FABRIK for both tasks (0.25-FABRIK, 0.53-PICs, 0.58-PICs). In addition, PIC and its soft version achieve a lower percentage of occlusion during incision (0.10-FABRIK, 0.04-PICs, 0.09-PICs) and a lower percentage of obstruction during assembly (0.09-FABRIK, 0.08-PICs, 0.07-PICs). These results show that PIC can both efficiently reproduce human poses and achieve key desired effects of human imitation.

* 9 pages, 7 figures, 3 tables 