Nassir Navab

Computer Aided Medical Procedures, Technische Universität München, Germany; Johns Hopkins University, Baltimore MD, USA

Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning

Mar 02, 2023

Ario Sadafi, Nassir Navab, Carsten Marr

Abstract:In many histopathology tasks, sample classification depends on morphological details in tissue or single cells that are only visible at the highest magnification. For a pathologist, this implies tedious zooming in and out, while for a computational decision support algorithm, it leads to the analysis of a huge number of small image patches per whole slide image (WSI). Attention-based multiple instance learning (MIL), where attention estimation is learned in a weakly supervised manner, has been successfully applied in computational histopathology, but it is challenged by large numbers of irrelevant patches, reducing its accuracy. Here, we present an active learning approach to the problem. Querying the expert to annotate regions of interest in a WSI guides the formation of high-attention regions for MIL. We train an attention-based MIL model and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation. We test our approach on the CAMELYON17 dataset classifying metastatic lymph node sections in breast cancer. With a novel attention guiding loss, this leads to an accuracy boost of the trained models with few regions annotated for each class. Active learning thus improves WSI classification accuracy, leads to faster and more robust convergence, and speeds up the annotation process. It may in the future serve as an important contribution to train MIL models in the clinically relevant context of cancer classification in histopathology.

* Accepted for publication at the 2023 IEEE International Symposium on Biomedical Imaging (ISBI 2023)
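
The two ingredients named in the abstract above, attention pooling over patches and uncertainty-based selection of slides, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the tanh attention form, the parameter names `V` and `w`, and the use of prediction entropy as the confidence metric are all assumptions, since the abstract does not specify them. The attention guiding loss is not shown.

```python
import math


def attention_pool(instances, V, w):
    """Pool patch embeddings into one bag embedding using attention weights.

    instances: list of K patch embeddings, each a list of D floats.
    V: L x D projection matrix (list of rows); w: vector of length L.
    Both V and w are hypothetical learned parameters.
    Returns (bag_embedding, attention_weights).
    """
    # Unnormalised score per patch: w^T tanh(V h_k)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    # Softmax over patches, shifted by the maximum for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Bag embedding is the attention-weighted sum of patch embeddings
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


def entropy(probs):
    """Shannon entropy of a class-probability vector (assumed confidence metric)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(class_probs_by_slide, n):
    """Return the n slide ids whose predictions have the highest entropy."""
    ranked = sorted(
        class_probs_by_slide,
        key=lambda slide: entropy(class_probs_by_slide[slide]),
        reverse=True,
    )
    return ranked[:n]
```

In an active learning loop of this kind, the slides returned by `most_uncertain` would be the ones handed to the expert for region annotation before the next training round.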

On the Importance of Patient Acceptance for Medical Robotic Imaging

Feb 13, 2023

Christine Eilers, Rob van Kemenade, Benjamin Busam, Nassir Navab

Abstract:Purpose: Mutual acceptance is required for any human-to-human interaction. Therefore, one would assume that this also holds for robot-patient interactions. However, the medical robotic imaging field lacks research in the area of acceptance. This work, therefore, aims at analyzing the influence of robot-patient interactions on acceptance in an exemplary medical robotic imaging system. Methods: We designed an interactive human-robot scenario, including auditive and gestural cues, and compared this pipeline to a non-interactive scenario. Both scenarios were evaluated through a questionnaire to measure acceptance. Heart rate monitoring was also used to measure stress. The impact of the interaction was quantified in the use case of robotic ultrasound scanning of the neck. Results: We conducted the first user study on patient acceptance of robotic ultrasound. Results show that verbal interactions impact trust more than gestural ones. Furthermore, through interaction, the robot is perceived to be friendlier. The heart rate data indicates that robot-patient interaction could reduce stress. Conclusion: Robot-patient interactions are crucial for improving acceptance in medical robotic imaging systems. While verbal interaction is most important, the preferred interaction type and content are participant-dependent. Heart rate values indicate that such interactions can also reduce stress. Overall, this initial work showed that interactions improve patient acceptance in medical robotic imaging, and other medical robot-patient systems can benefit from the design proposals to enhance acceptance in interactive scenarios.

* Under submission for IPCAI/IJCARS 2023

CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

Feb 13, 2023

Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi(+39 more)

Abstract:Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of <instrument, verb, target> triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results, their significance, and useful insights for future research directions and applications in surgery.

* MICCAI EndoVis CholecTriplet2022 challenge report. Submitted to journal of Medical Image Analysis. 22 pages, 14 figures, 6 tables

Investigating Pulse-Echo Sound Speed Estimation in Breast Ultrasound with Deep Learning

Feb 06, 2023

Walter A. Simson, Magdalini Paschali, Vasiliki Sideri-Lampretsa, Nassir Navab, Jeremy J. Dahl

Abstract:Ultrasound is an adjunct tool to mammography that can quickly and safely aid physicians with diagnosing breast abnormalities. Clinical ultrasound often assumes a constant sound speed to form B-mode images for diagnosis. However, the various types of breast tissue, such as glandular, fat, and lesions, differ in sound speed. These differences can degrade the image reconstruction process. Alternatively, sound speed can be a powerful tool for identifying disease. To this end, we propose a deep-learning approach for sound speed estimation from in-phase and quadrature ultrasound signals. First, we develop a large-scale simulated ultrasound dataset that generates quasi-realistic breast tissue by modeling breast gland, skin, and lesions with varying echogenicity and sound speed. We then develop a fully convolutional neural network architecture, trained on the simulated dataset, that produces an estimated sound speed map from three complex-valued in-phase and quadrature ultrasound images formed from plane-wave transmissions at separate angles. Furthermore, thermal noise augmentation is used during model optimization to enhance generalizability to real ultrasound data. We evaluate the model on simulated, phantom, and in-vivo breast ultrasound data, demonstrating its ability to accurately estimate sound speeds consistent with previously reported values in the literature. Our simulated dataset and model will be publicly available to provide a step towards accurate and generalizable sound speed estimation for pulse-echo ultrasound imaging.

KST-Mixer: Kinematic Spatio-Temporal Data Mixer For Colon Shape Estimation

Feb 02, 2023

Masahiro Oda, Kazuhiro Furukawa, Nassir Navab, Kensaku Mori

Abstract:We propose a spatio-temporal mixing kinematic data estimation method to estimate the shape of the colon with deformations caused by colonoscope insertion. Endoscope tracking or a navigation system that navigates physicians to target positions is needed to reduce such complications as organ perforations. Although many previous methods focused on tracking bronchoscopes and surgical endoscopes, few colonoscope tracking methods have been proposed. This is because the colon largely deforms during colonoscope insertion. The deformation causes significant tracking errors. Colon deformation should be taken into account in the tracking process. We propose a colon shape estimation method using a Kinematic Spatio-Temporal data Mixer (KST-Mixer) that can be used during colonoscope insertions into the colon. Kinematic data of a colonoscope and the colon, including positions and directions of their centerlines, are obtained using electromagnetic and depth sensors. The proposed method separates the data into sub-groups along the spatial and temporal axes. The KST-Mixer extracts kinematic features and mixes them along the spatial and temporal axes multiple times. We evaluated colon shape estimation accuracies in phantom studies. The proposed method achieved a mean Euclidean distance error of 11.92 mm, smaller than that of the previous methods. Statistical analysis indicated that the proposed method significantly reduced the error compared to the previous methods.

* Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023
* Accepted paper as an oral presentation at Joint MICCAI workshop 2022, AE-CAI/CARE/OR2.0. Received the Outstanding Paper Award
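
The evaluation metric quoted in the abstract above, mean Euclidean distance error, can be computed as in the short sketch below. It assumes that estimated and reference centerline points are already in one-to-one correspondence and expressed in millimetres; the abstract does not say how correspondence was established, and the function name is illustrative only.

```python
import math


def mean_euclidean_error(estimated, reference):
    """Average point-to-point Euclidean distance between two matched point lists.

    estimated, reference: equally long sequences of 3D points (x, y, z).
    Point i of one list is assumed to correspond to point i of the other.
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("expected two non-empty point lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(estimated, reference)]
    return sum(distances) / len(distances)
```

For example, with one point displaced by 5 mm and one point matching exactly, the function returns 2.5.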

Lidar Upsampling with Sliced Wasserstein Distance

Jan 31, 2023

Artem Savkin, Yida Wang, Sebastian Wirkert, Nassir Navab, Federico Tombari

Abstract:Lidar has become an important component of the perception systems in autonomous driving. However, the challenges of training data acquisition and annotation have emphasized the role of sensor-to-sensor domain adaptation. In this work, we address the problem of lidar upsampling. Learning on lidar point clouds is a rather challenging task due to their irregular and sparse structure. Here we propose a method for lidar point cloud upsampling which can reconstruct fine-grained lidar scan patterns. The key idea is to utilize edge-aware dense convolutions for both feature extraction and feature expansion. Additionally, applying a more accurate Sliced Wasserstein Distance facilitates learning of the fine lidar sweep structures. This in turn enables our method to employ a one-stage upsampling paradigm without the need for coarse and fine reconstruction. We conduct several experiments to evaluate our method and demonstrate that it provides better upsampling.

* in IEEE Robotics and Automation Letters, vol. 8, no. 1, pp. 392-399, Jan. 2023
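
For readers unfamiliar with the Sliced Wasserstein Distance mentioned in the abstract above, the sketch below shows the generic Monte Carlo estimator for two equally sized 3D point sets. It is not the authors' loss: the number of projections, the fixed seed, the equal-size restriction, and the function name are assumptions made for this illustration.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_projections=64, seed=0):
    """Estimate the sliced 2-Wasserstein distance between two 3D point sets.

    xs, ys: equally long lists of (x, y, z) points.
    Each random direction reduces the problem to 1D, where optimal transport
    simply matches the sorted projections of both sets.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("this sketch assumes non-empty, equally sized point sets")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_projections):
        # Random unit direction obtained by normalising a Gaussian sample
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Project both point sets onto the direction and sort the results
        px = sorted(sum(a * b for a, b in zip(p, d)) for p in xs)
        py = sorted(sum(a * b for a, b in zip(p, d)) for p in ys)
        # Mean squared difference between matched (sorted) projections
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return math.sqrt(total / n_projections)
```

Because the projections are sorted before matching, the estimate does not depend on the order in which points are stored, which is one reason this family of distances suits unordered point clouds.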

Ultra-NeRF: Neural Radiance Fields for Ultrasound Imaging

Jan 25, 2023

Magdalena Wysocki, Mohammad Farid Azampour, Christine Eilers, Benjamin Busam, Mehrdad Salehi, Nassir Navab

Abstract:We present a physics-enhanced implicit neural representation (INR) for ultrasound (US) imaging that learns tissue properties from overlapping US sweeps. Our proposed method leverages a ray-tracing-based neural rendering for novel view US synthesis. Recent publications demonstrated that INR models could encode a representation of a three-dimensional scene from a set of two-dimensional US frames. However, these models fail to consider the view-dependent changes in appearance and geometry intrinsic to US imaging. In our work, we discuss direction-dependent changes in the scene and show that a physics-inspired rendering improves the fidelity of US image synthesis. In particular, we demonstrate experimentally that our proposed method generates geometrically accurate B-mode images for regions with ambiguous representation owing to view-dependent differences of the US images. We conduct our experiments using simulated B-mode US sweeps of the liver and acquired US sweeps of a spine phantom tracked with a robotic arm. The experiments corroborate that our method generates US frames that enable consistent volume compounding from previously unseen views. To the best of our knowledge, the presented work is the first to address view-dependent US image synthesis using INR.

* submitted to MIDL

Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing

Jan 17, 2023

Shervin Dehghani, Michael Sommersperger, Peiyao Zhang, Alejandro Martin-Gomez, Benjamin Busam, Peter Gehlbach, Nassir Navab, M. Ali Nasseri, Iulian Iordachita

Abstract:In the last decade, various robotic platforms have been introduced that could support delicate retinal surgeries. Concurrently, to provide semantic understanding of the surgical area, recent advances have enabled microscope-integrated intraoperative Optical Coherence Tomography (iOCT) with high-resolution 3D imaging at near video rate. The combination of robotics and semantic understanding enables task autonomy in robotic retinal surgery, such as for subretinal injection. This procedure requires precise needle insertion for best treatment outcomes. However, merging robotic systems with iOCT introduces new challenges. These include, but are not limited to, high demands on data processing rates and dynamic registration of these systems during the procedure. In this work, we propose a framework for autonomous robotic navigation for subretinal injection, based on intelligent real-time processing of iOCT volumes. Our method consists of an instrument pose estimation method, an online registration between the robotic and the iOCT system, and trajectory planning tailored for navigation to an injection target. We also introduce intelligent virtual B-scans, a volume slicing approach for rapid instrument pose estimation, which is enabled by Convolutional Neural Networks (CNNs). Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method. Finally, we discuss identified challenges in this work and suggest potential solutions to further the development of such systems.

TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation

Dec 25, 2022

Hanzhi Chen, Fabian Manhardt, Nassir Navab, Benjamin Busam

Abstract:In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images. Our major contribution is a novel learning scheme which removes the drawbacks of previous works, namely the strong dependency on co-modalities or additional refinement. These have been previously necessary to provide training signals for convergence. We formulate such a scheme as two sub-optimisation problems on texture learning and pose learning. We separately learn to predict realistic texture of objects from real image collections and learn pose estimation from pixel-perfect synthetic data. Combining these two capabilities then allows us to synthesise photorealistic novel views to supervise the pose estimator with accurate geometry. To alleviate pose noise and segmentation imperfection present during the texture learning phase, we propose a surfel-based adversarial training loss together with texture regularisation from synthetic data. We demonstrate that the proposed approach significantly outperforms the recent state-of-the-art methods without ground-truth pose annotations and demonstrates substantial generalisation improvements towards unseen scenes. Remarkably, our scheme improves the adopted pose estimators substantially even when initialised with much inferior performance.

SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments

Dec 22, 2022

Evin Pınar Örnek, Aravindhan K Krishnan, Shreekant Gayaka, Cheng-Hao Kuo, Arnie Sen, Nassir Navab, Federico Tombari

Abstract:Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split for Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. Additionally, it is extremely lightweight (0.4 MB memory requirement) and suitable for mobile and robotic applications. The dataset split and code will be made publicly available upon acceptance.
