For many applications in the field of computer-assisted surgery, such as providing the position of a tumor, suggesting the tool most likely needed next by the surgeon, or estimating the remaining duration of surgery, methods for surgical workflow analysis are a prerequisite. Machine learning-based approaches often serve as the basis for surgical workflow analysis. In general, machine learning algorithms, such as convolutional neural networks (CNNs), require large amounts of labeled data. While data is often available in abundance, many tasks in surgical workflow analysis need data annotated by domain experts, making it difficult to obtain a sufficient number of annotations. The aim of using active learning to train a machine learning model is to reduce this annotation effort. Active learning methods determine which unlabeled data points would provide the most information according to some metric, such as prediction uncertainty; experts are then asked to annotate only these data points. The model is retrained with the new data and used to select further data for annotation. Recently, active learning has been applied to CNNs by means of deep Bayesian networks (DBNs), which make it possible to assign uncertainties to predictions. In this paper, we present a DBN-based active learning approach adapted for image-based surgical workflow analysis tasks. Furthermore, by using a recurrent architecture, we extend this network to video-based surgical workflow analysis. We evaluate these approaches on the Cholec80 dataset by performing instrument presence detection and surgical phase segmentation. We show that a DBN-based active learning approach for selecting which data points to annotate next outperforms a baseline that selects data points at random.
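The abstract does not spell out the acquisition function, but a common way to realize DBN-style uncertainty in practice is Monte Carlo dropout. The following PyTorch sketch, written under that assumption, ranks unlabeled frames by predictive entropy; the names `mc_dropout_entropy` and `select_for_annotation` are illustrative and not taken from the original work.

```python
import torch
import torch.nn.functional as F

def mc_dropout_entropy(model, images, n_samples=20):
    """Predictive entropy under Monte Carlo dropout: dropout stays
    active at inference time and the softmax outputs of several
    stochastic forward passes are averaged."""
    model.train()  # keep dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(images), dim=1) for _ in range(n_samples)]
        ).mean(dim=0)  # (batch, classes): mean predictive distribution
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

def select_for_annotation(model, unlabeled_batches, budget=100):
    """Rank unlabeled frames by uncertainty, return indices of the top ones."""
    scores = torch.cat([mc_dropout_entropy(model, b) for b in unlabeled_batches])
    return scores.topk(min(budget, len(scores))).indices
```

The surrounding active learning loop would send the returned frames to the annotator, add them to the labeled pool, retrain, and repeat.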
In order to provide the right type of assistance at the right time, computer-assisted surgery systems need context awareness. To achieve this, methods for surgical workflow analysis are crucial. Currently, convolutional neural networks provide the best performance for video-based workflow analysis tasks. Training such networks requires large amounts of annotated data, but collecting a sufficient amount of data is often costly, time-consuming, and not always feasible. In this paper, we address this problem by presenting and comparing different approaches for self-supervised pretraining of neural networks on unlabeled laparoscopic videos using temporal coherence. We evaluate our pretrained networks on Cholec80, a publicly available dataset for surgical phase segmentation, on which a maximum F1 score of 84.6 was reached. Furthermore, we achieved an increase in F1 score of up to 10 points compared to a non-pretrained neural network.
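The abstract does not detail how temporal coherence is turned into a training signal; one standard formulation is a contrastive loss that pulls the embeddings of temporally close frames together and pushes temporally distant frames apart. The PyTorch sketch below illustrates that idea under this assumption; the function name is hypothetical.

```python
import torch.nn.functional as F

def temporal_coherence_loss(emb_anchor, emb_near, emb_far, margin=1.0):
    """Contrastive loss on frame embeddings: emb_anchor and emb_near stem
    from frames a few seconds apart (positive pair), emb_far from a
    distant part of the same video (negative pair)."""
    d_pos = F.pairwise_distance(emb_anchor, emb_near)  # kept small
    d_neg = F.pairwise_distance(emb_anchor, emb_far)   # pushed past margin
    return (d_pos.pow(2) + F.relu(margin - d_neg).pow(2)).mean()
```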
During laparoscopic surgery, context-aware assistance systems aim to alleviate some of the difficulties the surgeon faces. To ensure that the right information is provided at the right time, the current phase of the intervention has to be known. Locating and classifying the surgical tools currently in use in real time are key components of both activity-based phase recognition and assistance generation. In this paper, we present an image-based approach that detects and classifies tools during laparoscopic interventions in real time. First, potential instrument bounding boxes are detected using a pixel-wise random forest segmentation. Each of these bounding boxes is then classified using a cascade of random forests. For this, multiple features, such as histograms over hue and saturation, gradients, and SURF features, are extracted from each detected bounding box. We evaluated our approach on five different videos from two different types of procedures. We distinguished between the four most common classes of instruments (LigaSure, atraumatic grasper, aspirator, clip applier) and background. Our method successfully located up to 86% of all instruments. On manually provided bounding boxes, we achieved an instrument type recognition rate of up to 58%, and on automatically detected bounding boxes up to 49%. To our knowledge, this is the first approach that allows image-based classification of surgical tools in a laparoscopic setting in real time.
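As an illustration of the classification stage, the following sketch extracts the hue and saturation histograms mentioned above and feeds them to a random forest (OpenCV and scikit-learn assumed). The gradient and SURF features as well as the cascade structure are omitted for brevity, and all names are illustrative.

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def hue_saturation_histogram(bgr_patch, bins=32):
    """Concatenated, normalized hue and saturation histograms of a
    bounding-box patch."""
    hsv = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2HSV)
    h_hist = cv2.calcHist([hsv], [0], None, [bins], [0, 180]).ravel()
    s_hist = cv2.calcHist([hsv], [1], None, [bins], [0, 256]).ravel()
    feat = np.concatenate([h_hist, s_hist])
    return feat / (feat.sum() + 1e-8)  # normalize so patch size cancels out

# a single random forest stands in for the cascade here
clf = RandomForestClassifier(n_estimators=100)
# X: one feature vector per detected bounding box,
# y: instrument class (or background)
# clf.fit(X, y); predictions = clf.predict(X_new)
```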
Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robot-assisted surgery. Since additional hardware, such as tracking systems or robot encoders, is cumbersome and lacks accuracy, surgical vision is evolving as a promising technique for segmenting and tracking instruments based only on endoscopic images. What has been missing so far, however, are common image data sets for consistent evaluation and benchmarking of algorithms against each other. This paper presents a comparative validation study of different vision-based methods for instrument segmentation and tracking in the context of robotic as well as conventional laparoscopic surgery. The contribution of the paper is twofold: we introduce a comprehensive validation data set that was provided to the study participants, and we present the results of the comparative validation study. Based on these results, we conclude that modern deep learning approaches outperform other methods in instrument segmentation tasks, but the results are still not perfect. Furthermore, we show that merging the results of different methods significantly increases accuracy compared to the best stand-alone method. On the other hand, the results of the instrument tracking task show that this is still an open challenge, especially in difficult scenarios in conventional laparoscopic surgery.
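The abstract leaves the fusion scheme unspecified; a pixel-wise majority vote over the binary instrument masks produced by the individual methods is one plausible reading, sketched below with NumPy (the function name is hypothetical).

```python
import numpy as np

def fuse_masks(masks):
    """Pixel-wise majority vote over binary segmentation masks.

    masks: array-like of shape (n_methods, H, W) with values in {0, 1};
    a pixel counts as 'instrument' if more than half the methods agree."""
    masks = np.asarray(masks)
    return (masks.sum(axis=0) > masks.shape[0] / 2).astype(np.uint8)
```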
Surgical data science is a new research field that aims to observe all aspects of the patient treatment process in order to provide the right assistance at the right time. Due to the breakthrough successes of deep learning-based solutions for automatic image annotation, the availability of reference annotations for algorithm training is becoming a major bottleneck in the field. The purpose of this paper is to investigate the concept of self-supervised learning to address this issue. Our approach is guided by the hypothesis that unlabeled video data can be used to learn a representation of the target domain that boosts the performance of state-of-the-art machine learning algorithms when used for pre-training. The core of the method is an auxiliary task, based on raw endoscopic video data of the target domain, that is used to initialize the convolutional neural network (CNN) for the target task. In this paper, we propose the re-colorization of medical images with a generative adversarial network (GAN)-based architecture as the auxiliary task. A variant of the method involves a second pre-training step based on labeled data for the target task from a related domain. We validate both variants using medical instrument segmentation as the target task. The proposed approach can radically reduce the manual annotation effort involved in training CNNs. Compared to the baseline approach of generating annotated data from scratch, our exploratory analysis indicates that the method reduces the number of labeled images required by up to 75% without sacrificing performance. Our method also outperforms alternative methods for CNN pre-training, such as pre-training on publicly available non-medical or medical data using the target task (in this instance: segmentation). As it makes efficient use of available public and non-public as well as labeled and unlabeled data, the approach has the potential to become a valuable tool for CNN (pre-)training.
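As a rough sketch of the auxiliary task, the generator below receives a grayscale version of an endoscopic frame and must reconstruct its original colors. The pix2pix-style combination of an adversarial term and an L1 reconstruction term is an assumption on our part, as are all names; after pre-training, the generator's encoder weights would initialize the segmentation CNN.

```python
import torch
import torch.nn.functional as F

def recolorization_generator_loss(generator, discriminator, color_batch,
                                  l1_weight=100.0):
    """Generator objective for the re-colorization auxiliary task: the
    generator sees a grayscale frame and must reconstruct the original
    colors, judged by a discriminator plus an L1 reconstruction term."""
    gray = color_batch.mean(dim=1, keepdim=True)  # crude grayscale proxy
    fake_color = generator(gray)
    logits = discriminator(fake_color)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    recon = F.l1_loss(fake_color, color_batch)
    return adv + l1_weight * recon  # minimized w.r.t. the generator only
```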
Computer-assisted surgery (CAS) aims to provide the surgeon with the right type of assistance at the right moment. Such assistance systems are especially relevant in laparoscopic surgery, where CAS can alleviate some of the drawbacks that surgeons incur. For many assistance functions, e.g. displaying the location of a tumor at the appropriate time or suggesting which instruments to prepare next, analyzing the surgical workflow is a prerequisite. Since laparoscopic interventions are performed via an endoscope, the video signal is an obvious sensor modality to rely on for workflow analysis. Image-based workflow analysis tasks in laparoscopy, such as phase recognition, skill assessment, video indexing, or automatic annotation, require a temporal distinction between video frames. Generally, computer vision-based methods that generalize from previously seen data are used. Training such methods requires large amounts of annotated data. Since annotating surgical data requires expert knowledge, collecting a sufficient amount of data is difficult, time-consuming, and not always feasible. In this paper, we address this problem by presenting an unsupervised method for training a convolutional neural network (CNN) to differentiate between laparoscopic video frames on a temporal basis. We extract video frames at regular intervals from 324 unlabeled laparoscopic interventions, resulting in a dataset of approximately 2.2 million images. From this dataset, we sample image pairs from the same video and train a CNN to determine their temporal order. To solve this task, the CNN has to extract features that are relevant for comprehending laparoscopic workflow. Furthermore, we demonstrate that such a CNN can be adapted for surgical workflow segmentation. We performed image-based workflow segmentation on a publicly available dataset of 7 cholecystectomies and 9 colorectal interventions.
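A minimal PyTorch sketch of the temporal order task, assuming a Siamese architecture with a shared backbone; the class name, the two-way classification head, and its dimensions are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class TemporalOrderNet(nn.Module):
    """Siamese network predicting which of two frames comes first: both
    frames pass through a shared backbone, and the concatenated
    embeddings feed a two-way classifier."""

    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone  # shared CNN feature extractor
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, 2)
        )

    def forward(self, frame_a, frame_b):
        feats = torch.cat(
            [self.backbone(frame_a), self.backbone(frame_b)], dim=1
        )
        return self.head(feats)  # logits: [a first, b first]

# labels: 0 if frame_a precedes frame_b in the video, else 1;
# train with nn.CrossEntropyLoss on pairs sampled from the same video
```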