Alert button
Picture for Xiaoyang Zou

Xiaoyang Zou

Alert button

CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

Feb 13, 2023
Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai, Ziheng Wang, Guo Rui, Melanie Schellenberg, João L. Vilaça, Tobias Czempiel, Zhenkun Wang, Debdoot Sheet, Shrawan Kumar Thapa, Max Berniker, Patrick Godau, Pedro Morais, Sudarshan Regmi, Thuy Nuong Tran, Jaime Fonseca, Jan-Hinrich Nölke, Estevão Lima, Eduard Vazquez, Lena Maier-Hein, Nassir Navab, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Didier Mutter, Nicolas Padoy

Figure 1 for CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
Figure 2 for CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
Figure 3 for CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
Figure 4 for CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of <instrument, verb, target> triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results, their significance, and useful insights for future research directions and applications in surgery.

* MICCAI EndoVis CholecTriplet2022 challenge report. Submitted to journal of Medical Image Analysis. 22 pages, 14 figures, 6 tables 
Viaarxiv icon

ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos

Sep 02, 2022
Xiaoyang Zou, Wenyong Liu, Junchen Wang, Rong Tao, Guoyan Zheng

Figure 1 for ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos
Figure 2 for ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos
Figure 3 for ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos
Figure 4 for ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos

Phase recognition plays an essential role for surgical workflow analysis in computer assisted intervention. Transformer, originally proposed for sequential data modeling in natural language processing, has been successfully applied to surgical phase recognition. Existing works based on transformer mainly focus on modeling attention dependency, without introducing auto-regression. In this work, an Auto-Regressive Surgical Transformer, referred as ARST, is first proposed for on-line surgical phase recognition from laparoscopic videos, modeling the inter-phase correlation implicitly by conditional probability distribution. To reduce inference bias and to enhance phase consistency, we further develop a consistency constraint inference strategy based on auto-regression. We conduct comprehensive validations on a well-known public dataset Cholec80. Experimental results show that our method outperforms the state-of-the-art methods both quantitatively and qualitatively, and achieves an inference rate of 66 frames per second (fps).

* 11 Pages, 3 figures 
Viaarxiv icon