Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nassir Navab

Computer Aided Medical Procedures, Technische Universit Munchen, Germany, Johns Hopkins University, Baltimore MD, USA

Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration

Mar 21, 2024

Chenyang Li, Dianye Huang, Angelos Karlas, Nassir Navab, Zhongliang Jiang

Figure 1 for Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration

Figure 2 for Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration

Figure 3 for Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration

Figure 4 for Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration

Abstract:In clinical applications that involve ultrasound-guided intervention, the visibility of the needle can be severely impeded due to steep insertion and strong distractors such as speckle noise and anatomical occlusion. To address this challenge, we propose VibNet, a learning-based framework tailored to enhance the robustness and accuracy of needle detection in ultrasound images, even when the target becomes invisible to the naked eye. Inspired by Eulerian Video Magnification techniques, we utilize an external step motor to induce low-amplitude periodic motion on the needle. These subtle vibrations offer the potential to generate robust frequency features for detecting the motion patterns around the needle. To robustly and precisely detect the needle leveraging these vibrations, VibNet integrates learning-based Short-Time-Fourier-Transform and Hough-Transform modules to achieve successive sub-goals, including motion feature extraction in the spatiotemporal space, frequency feature aggregation, and needle detection in the Hough space. Based on the results obtained on distinct ex vivo porcine and bovine tissue samples, the proposed algorithm exhibits superior detection performance with efficient computation and generalization capability.

Via

Access Paper or Ask Questions

CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers

Mar 21, 2024

Alex Ranne, Liming Kuang, Yordanka Velikova, Nassir Navab, Ferdinando Rodriguez y Baena

Abstract:In minimally invasive endovascular procedures, contrast-enhanced angiography remains the most robust imaging technique. However, it is at the expense of the patient and clinician's health due to prolonged radiation exposure. As an alternative, interventional ultrasound has notable benefits such as being radiation-free, fast to deploy, and having a small footprint in the operating room. Yet, ultrasound is hard to interpret, and highly prone to artifacts and noise. Additionally, interventional radiologists must undergo extensive training before they become qualified to diagnose and treat patients effectively, leading to a shortage of staff, and a lack of open-source datasets. In this work, we seek to address both problems by introducing a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images, without demanding any labeled data. The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism, and is capable of learning feature changes across time and space. To facilitate training, we used synthetic ultrasound data based on physics-driven catheter insertion simulations, and translated the data into a unique CT-Ultrasound common domain, CACTUSS, to improve the segmentation performance. We generated ground truth segmentation masks by computing the optical flow between adjacent frames using FlowNet2, and performed thresholding to obtain a binary map estimate. Finally, we validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms, thus demonstrating its potential for applications to clinical data in the future.

* This work has been submitted to the IEEE for possible publication

Via

Access Paper or Ask Questions

FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos

Mar 18, 2024

Florian Philipp Stilz, Mert Asim Karaoglu, Felix Tristram, Nassir Navab, Benjamin Busam, Alexander Ladikos

Abstract:Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training. Neural rendering has recently shown promising results in endoscopic reconstruction with deforming tissue. However, the setup has been restricted to a static endoscope, limited deformation, or required an external tracking device to retrieve camera pose information of the endoscopic camera. With FLex we adress the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue. We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch. This improves the ease-of-use and allows to scale reconstruction capabilities in time to process surgical videos of 5,000 frames and more; an improvement of more than ten times compared to the state of the art while being agnostic to external tracking information. Extensive evaluations on the StereoMIS dataset show that FLex significantly improves the quality of novel view synthesis while maintaining competitive pose accuracy.

Via

Access Paper or Ask Questions

MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Mar 03, 2024

Junwen Huang, Hao Yu, Kuan-Ting Yu, Nassir Navab, Slobodan Ilic, Benjamin Busam

Figure 1 for MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Figure 2 for MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Figure 3 for MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Figure 4 for MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Abstract:Recent learning methods for object pose estimation require resource-intensive training for each individual object instance or category, hampering their scalability in real applications when confronted with previously unseen objects. In this paper, we propose MatchU, a Fuse-Describe-Match strategy for 6D pose estimation from RGB-D images. MatchU is a generic approach that fuses 2D texture and 3D geometric cues for 6D pose prediction of unseen objects. We rely on learning geometric 3D descriptors that are rotation-invariant by design. By encoding pose-agnostic geometry, the learned descriptors naturally generalize to unseen objects and capture symmetries. To tackle ambiguous associations using 3D geometry only, we fuse additional RGB information into our descriptor. This is achieved through a novel attention-based mechanism that fuses cross-modal information, together with a matching loss that leverages the latent space learned from RGB data to guide the descriptor learning process. Extensive experiments reveal the generalizability of both the RGB-D fusion strategy as well as the descriptor efficacy. Benefiting from the novel designs, MatchU surpasses all existing methods by a significant margin in terms of both accuracy and speed, even without the requirement of expensive re-training or rendering.

Via

Access Paper or Ask Questions

Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Feb 05, 2024

Mahdi Saleh, Michael Sommersperger, Nassir Navab, Federico Tombari

Figure 1 for Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Figure 2 for Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Figure 3 for Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Figure 4 for Physics-Encoded Graph Neural Networks for Deformation Prediction under Contact

Abstract:In robotics, it's crucial to understand object deformation during tactile interactions. A precise understanding of deformation can elevate robotic simulations and have broad implications across different industries. We introduce a method using Physics-Encoded Graph Neural Networks (GNNs) for such predictions. Similar to robotic grasping and manipulation scenarios, we focus on modeling the dynamics between a rigid mesh contacting a deformable mesh under external forces. Our approach represents both the soft body and the rigid body within graph structures, where nodes hold the physical states of the meshes. We also incorporate cross-attention mechanisms to capture the interplay between the objects. By jointly learning geometry and physics, our model reconstructs consistent and detailed deformations. We've made our code and dataset public to advance research in robotic simulation and grasping.

* Accepted at 2024 IEEE International Conference on Robotics and Automation (ICRA2024)

Via

Access Paper or Ask Questions

Machine Learning in Robotic Ultrasound Imaging: Challenges and Perspectives

Jan 04, 2024

Yuan Bi, Zhongliang Jiang, Felix Duelmer, Dianye Huang, Nassir Navab

Figure 1 for Machine Learning in Robotic Ultrasound Imaging: Challenges and Perspectives

Figure 2 for Machine Learning in Robotic Ultrasound Imaging: Challenges and Perspectives

Abstract:This article reviews the recent advances in intelligent robotic ultrasound (US) imaging systems. We commence by presenting the commonly employed robotic mechanisms and control techniques in robotic US imaging, along with their clinical applications. Subsequently, we focus on the deployment of machine learning techniques in the development of robotic sonographers, emphasizing crucial developments aimed at enhancing the intelligence of these systems. The methods for achieving autonomous action reasoning are categorized into two sets of approaches: those relying on implicit environmental data interpretation and those using explicit interpretation. Throughout this exploration, we also discuss practical challenges, including those related to the scarcity of medical data, the need for a deeper understanding of the physical aspects involved, and effective data representation approaches. Moreover, we conclude by highlighting the open problems in the field and analyzing different possible perspectives on how the community could move forward in this research area.

* Accepted by Annual Review of Control, Robotics, and Autonomous Systems

Via

Access Paper or Ask Questions

Robot-Assisted Deep Venous Thrombosis Ultrasound Examination using Virtual Fixture

Jan 04, 2024

Dianye Huang, Chenguang Yang, Mingchuan Zhou, Angelos Karlas, Nassir Navab, Zhongliang Jiang

Abstract:Deep Venous Thrombosis (DVT) is a common vascular disease with blood clots inside deep veins, which may block blood flow or even cause a life-threatening pulmonary embolism. A typical exam for DVT using ultrasound (US) imaging is by pressing the target vein until its lumen is fully compressed. However, the compression exam is highly operator-dependent. To alleviate intra- and inter-variations, we present a robotic US system with a novel hybrid force motion control scheme ensuring position and force tracking accuracy, and soft landing of the probe onto the target surface. In addition, a path-based virtual fixture is proposed to realize easy human-robot interaction for repeat compression operation at the lesion location. To ensure the biometric measurements obtained in different examinations are comparable, the 6D scanning path is determined in a coarse-to-fine manner using both an external RGBD camera and US images. The RGBD camera is first used to extract a rough scanning path on the object. Then, the segmented vascular lumen from US images are used to optimize the scanning path to ensure the visibility of the target object. To generate a continuous scan path for developing virtual fixtures, an arc-length based path fitting model considering both position and orientation is proposed. Finally, the whole system is evaluated on a human-like arm phantom with an uneven surface.

* Accepted Paper IEEE T-ASE

Via

Access Paper or Ask Questions

On Discprecncies between Perturbation Evaluations of Graph Neural Network Attributions

Jan 01, 2024

Razieh Rezaei, Alireza Dizaji, Ashkan Khakzar, Anees Kazi, Nassir Navab, Daniel Rueckert

Abstract:Neural networks are increasingly finding their way into the realm of graphs and modeling relationships between features. Concurrently graph neural network explanation approaches are being invented to uncover relationships between the nodes of the graphs. However, there is a disparity between the existing attribution methods, and it is unclear which attribution to trust. Therefore research has introduced evaluation experiments that assess them from different perspectives. In this work, we assess attribution methods from a perspective not previously explored in the graph domain: retraining. The core idea is to retrain the network on important (or not important) relationships as identified by the attributions and evaluate how networks can generalize based on these relationships. We reformulate the retraining framework to sidestep issues lurking in the previous formulation and propose guidelines for correct analysis. We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets. The analysis reveals that attributions perform variably depending on the dataset and the network. Most importantly, we observe that the famous GNNExplainer performs similarly to an arbitrary designation of edge importance. The study concludes that the retraining evaluation cannot be used as a generalized benchmark and recommends it as a toolset to evaluate attributions on a specifically addressed network, dataset, and sparsity.

Via

Access Paper or Ask Questions

Deformable 3D Gaussian Splatting for Animatable Human Avatars

Dec 22, 2023

HyunJun Jung, Nikolas Brasch, Jifei Song, Eduardo Perez-Pellitero, Yiren Zhou, Zhihao Li, Nassir Navab, Benjamin Busam

Figure 1 for Deformable 3D Gaussian Splatting for Animatable Human Avatars

Figure 2 for Deformable 3D Gaussian Splatting for Animatable Human Avatars

Figure 3 for Deformable 3D Gaussian Splatting for Animatable Human Avatars

Figure 4 for Deformable 3D Gaussian Splatting for Animatable Human Avatars

Abstract:Recent advances in neural radiance fields enable novel view synthesis of photo-realistic images in dynamic settings, which can be applied to scenarios with human animation. Commonly used implicit backbones to establish accurate models, however, require many input views and additional annotations such as human masks, UV maps and depth maps. In this work, we propose ParDy-Human (Parameterized Dynamic Human Avatar), a fully explicit approach to construct a digital avatar from as little as a single monocular sequence. ParDy-Human introduces parameter-driven dynamics into 3D Gaussian Splatting where 3D Gaussians are deformed by a human pose model to animate the avatar. Our method is composed of two parts: A first module that deforms canonical 3D Gaussians according to SMPL vertices and a consecutive module that further takes their designed joint encodings and predicts per Gaussian deformations to deal with dynamics beyond SMPL vertex deformations. Images are then synthesized by a rasterizer. ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images. Our avatars learning is free of additional annotations such as masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware. We provide experimental evidence to show that ParDy-Human outperforms state-of-the-art methods on ZJU-MoCap and THUman4.0 datasets both quantitatively and visually.

Via

Access Paper or Ask Questions

Advancing Surgical VQA with Scene Graph Knowledge

Dec 15, 2023

Kun Yuan, Manasi Kattel, Joel L. Lavanchy, Nassir Navab, Vinkle Srivastava, Nicolas Padoy

Abstract:Modern operating room is becoming increasingly complex, requiring innovative intra-operative support systems. While the focus of surgical data science has largely been on video analysis, integrating surgical computer vision with language capabilities is emerging as a necessity. Our work aims to advance Visual Question Answering (VQA) in the surgical context with scene graph knowledge, addressing two main challenges in the current surgical VQA systems: removing question-condition bias in the surgical VQA dataset and incorporating scene-aware reasoning in the surgical VQA model design. First, we propose a Surgical Scene Graph-based dataset, SSG-QA, generated by employing segmentation and detection models on publicly available datasets. We build surgical scene graphs using spatial and action information of instruments and anatomies. These graphs are fed into a question engine, generating diverse QA pairs. Our SSG-QA dataset provides a more complex, diverse, geometrically grounded, unbiased, and surgical action-oriented dataset compared to existing surgical VQA datasets. We then propose SSG-QA-Net, a novel surgical VQA model incorporating a lightweight Scene-embedded Interaction Module (SIM), which integrates geometric scene knowledge in the VQA model design by employing cross-attention between the textual and the scene features. Our comprehensive analysis of the SSG-QA dataset shows that SSG-QA-Net outperforms existing methods across different question types and complexities. We highlight that the primary limitation in the current surgical VQA systems is the lack of scene knowledge to answer complex queries. We present a novel surgical VQA dataset and model and show that results can be significantly improved by incorporating geometric scene features in the VQA model design. The source code and the dataset will be made publicly available at: https://github.com/CAMMA-public/SSG-QA

* Conditional Accept by IPCAI 2024

Via

Access Paper or Ask Questions