Medical data analysis often combines imaging and tabular data processing using machine learning algorithms. While previous studies have investigated the impact of attention mechanisms on deep learning models, few have explored integrating attention modules with tabular data. In this paper, we introduce TabAttention, a novel module that enhances the performance of Convolutional Neural Networks (CNNs) with an attention mechanism trained conditionally on tabular data. Specifically, we extend the Convolutional Block Attention Module to 3D by adding a Temporal Attention Module that uses multi-head self-attention to learn attention maps. Furthermore, we enhance all attention modules by integrating tabular data embeddings. Our approach is demonstrated on the fetal birth weight (FBW) estimation task, using 92 fetal abdominal ultrasound video scans and fetal biometry measurements. Our results indicate that TabAttention outperforms clinicians and existing methods that rely on tabular and/or imaging data for FBW prediction. This novel approach has the potential to improve computer-aided diagnosis in various clinical workflows where imaging and tabular data are combined. We provide source code for integrating TabAttention into CNNs at https://github.com/SanoScience/Tab-Attention.
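The abstract does not spell the module out, so the following is only a minimal PyTorch sketch of a TabAttention-style block: CBAM-like channel gating plus a multi-head self-attention temporal gate, both conditioned on a tabular embedding. Tensor shapes, the gating scheme, and the tabular-token trick are our assumptions; the released code at https://github.com/SanoScience/Tab-Attention is authoritative.

```python
import torch
import torch.nn as nn


class TemporalAttention(nn.Module):
    """Multi-head self-attention over frames, with the tabular embedding
    prepended as an extra token (an assumption of this sketch)."""

    def __init__(self, channels: int, n_heads: int = 4):
        super().__init__()
        self.mhsa = nn.MultiheadAttention(channels, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, tab_emb: torch.Tensor) -> torch.Tensor:
        b, c, t, h, w = x.shape
        tokens = x.mean(dim=(3, 4)).transpose(1, 2)                # (B, T, C)
        tokens = torch.cat([tab_emb.unsqueeze(1), tokens], dim=1)  # (B, 1+T, C)
        attn, _ = self.mhsa(tokens, tokens, tokens)
        gate = torch.sigmoid(attn[:, 1:, :]).transpose(1, 2)       # (B, C, T)
        return x * gate.unsqueeze(-1).unsqueeze(-1)                # broadcast over H, W


class TabAttention(nn.Module):
    """Channel attention conditioned on tabular data, then temporal attention."""

    def __init__(self, channels: int, tab_dim: int):
        super().__init__()
        self.tab_proj = nn.Linear(tab_dim, channels)
        self.channel_mlp = nn.Sequential(
            nn.Linear(2 * channels, channels // 4), nn.ReLU(),
            nn.Linear(channels // 4, channels),
        )
        self.temporal = TemporalAttention(channels)

    def forward(self, x: torch.Tensor, tab: torch.Tensor) -> torch.Tensor:
        tab_emb = self.tab_proj(tab)                 # embed tabular row to C dims
        squeeze = x.mean(dim=(2, 3, 4))              # global pool: (B, C)
        gate = torch.sigmoid(self.channel_mlp(torch.cat([squeeze, tab_emb], dim=1)))
        x = x * gate[:, :, None, None, None]
        return self.temporal(x, tab_emb)


# Example: a (batch, channels, frames, height, width) feature map plus 6 biometrics.
out = TabAttention(64, 6)(torch.randn(2, 64, 16, 8, 8), torch.randn(2, 6))
```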
The effectiveness of digital treatments can be measured by requiring patients to self-report their mental and physical state through mobile applications. However, self-reporting can be overwhelming and may cause patients to disengage from the intervention. To address this issue, we conduct a feasibility study to explore the impact of gamification on the cognitive burden of self-reporting. Our approach involves the creation of a system to assess cognitive burden through the analysis of photoplethysmography (PPG) signals obtained from a smartwatch. The system is built by collecting PPG data during both cognitively demanding tasks and periods of rest. The obtained data is used to train a machine learning model to detect cognitive load (CL). Subsequently, we create two versions of health surveys: a gamified version and a traditional version. Our aim is to estimate the cognitive load experienced by participants while completing these surveys on their mobile devices. We find that CL detector performance can be enhanced via pre-training on stress detection tasks, and that at least 30 seconds of PPG signal must be captured for the detector to work adequately. For 10 out of 13 participants, a personalized cognitive load detector achieves an F1 score above 0.7. We find no difference between the gamified and non-gamified mobile surveys in terms of time spent in a state of high cognitive load, but participants prefer the gamified version. The average time spent on each question is 5.5 seconds for the gamified survey versus 6 seconds for the non-gamified version.
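As a purely illustrative reading of the pre-training result, the sketch below shows a small 1D CNN over 30-second PPG windows whose feature extractor is initialized from a stress-detection checkpoint and frozen before per-participant fine-tuning. The architecture, the 64 Hz sampling rate, and the checkpoint name are assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

FS = 64                  # assumed PPG sampling rate (Hz)
WINDOW = 30 * FS         # the study finds at least 30 s of signal is needed

class PPGNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)   # binary: high vs. low cognitive load

    def forward(self, x):              # x: (batch, 1, WINDOW)
        return self.head(self.features(x))

model = PPGNet()
# Transfer step (hypothetical checkpoint from stress-detection pre-training):
# model.load_state_dict(torch.load("stress_pretrained.pt"), strict=False)
for p in model.features.parameters():
    p.requires_grad = False           # fine-tune only the head per participant

logits = model(torch.randn(4, 1, WINDOW))   # four 30-second PPG windows
```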
Cardiac Magnetic Resonance (CMR) imaging is commonly used for the assessment of cardiac anatomy and function. The delineation of the left and right ventricle blood pools and the left ventricular myocardium is important for the diagnosis of cardiac diseases. Unfortunately, patient movement during CMR acquisition may introduce motion artifacts into the final image. Such artifacts decrease the diagnostic quality of CMR images and may force the procedure to be repeated. In this paper, we present a Multi-task Swin UNEt TRansformer network that simultaneously solves the two tasks of the CMRxMotion challenge: CMR segmentation and motion artifact classification. We treat segmentation and classification as a multi-task learning problem, which allows us to determine the diagnostic quality of a CMR image and generate masks at the same time. CMR images are classified into three diagnostic quality classes, while all samples with non-severe motion artifacts are segmented. An ensemble of five networks trained with 5-fold cross-validation achieves a Dice coefficient of 0.871 for segmentation and a classification accuracy of 0.595.
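The multi-task design amounts to a shared encoder feeding two heads. Below is a generic PyTorch sketch (a toy convolutional encoder stands in for the Swin UNETR backbone; class counts follow the challenge: three quality classes, and background plus three cardiac structures for segmentation). It illustrates the pattern, not the authors' network.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_ch=1, n_seg_classes=4, n_quality_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(                     # shared features
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                     # segmentation head
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, n_seg_classes, 2, stride=2),
        )
        self.classifier = nn.Sequential(                  # quality head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_quality_classes),
        )

    def forward(self, x):
        feats = self.encoder(x)
        return self.decoder(feats), self.classifier(feats)

seg_logits, quality_logits = MultiTaskNet()(torch.randn(2, 1, 128, 128))
```

Training would then minimize a weighted sum of a segmentation loss (e.g. Dice plus cross-entropy) and a cross-entropy quality loss.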
Spina Bifida (SB) is a birth defect that develops during early pregnancy, in which the spine fails to close completely around the spinal cord. The growing interest in fetoscopic SB repair, which is performed on fetuses still in the uterus, prompts the need for appropriate training. The learning curve for such procedures is steep and requires excellent procedural skills. Computer-based virtual reality (VR) simulation systems offer a safe, cost-effective, and configurable training environment free from ethical and patient safety issues. However, to the best of our knowledge, there are currently no commercial or experimental VR training simulation systems available for fetoscopic SB repair procedures. In this paper, we propose a novel VR simulator for core manual skills training for SB repair. An initial simulation realism validation study was carried out by obtaining subjective feedback (face and content validity) from 14 clinicians. The overall simulation realism was on average rated 4.07 on a 5-point Likert scale (1 - very unrealistic, 5 - very realistic). Its usefulness as a training tool for SB repair and for learning fundamental laparoscopic skills was rated 4.63 and 4.80, respectively. These results indicate that VR simulation of fetoscopic procedures may contribute to surgical training without putting fetuses and their mothers at risk. It could also facilitate wider adoption of fetoscopic procedures in place of much more invasive open fetal surgeries.
Fetoscopic laser photocoagulation is a widely adopted procedure for treating Twin-to-Twin Transfusion Syndrome (TTTS). The procedure involves photocoagulation of pathological anastomoses to regulate blood exchange between the twins. It is particularly challenging due to the limited field of view, poor manoeuvrability of the fetoscope, poor visibility, and variability in illumination. These challenges may lead to increased surgery time and incomplete ablation. Computer-assisted intervention (CAI) can provide surgeons with decision support and context awareness by identifying key structures in the scene and expanding the fetoscopic field of view through video mosaicking. Research in this domain has been hampered by the lack of high-quality data for designing, developing and testing CAI algorithms. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge, organized as part of the MICCAI 2021 Endoscopic Vision challenge, we released the first large-scale multicentre TTTS dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms. For this challenge, we released a dataset of 2060 images, pixel-annotated for vessel, tool, fetus and background classes, from 18 in-vivo TTTS fetoscopy procedures and 18 short video clips. Seven teams participated, and their model performance was assessed on an unseen test dataset of 658 pixel-annotated images from 6 fetoscopic procedures and 6 short clips. The challenge provided an opportunity for creating generalized solutions for fetoscopic scene understanding and mosaicking. In this paper, we present the findings of the FetReg2021 challenge alongside a detailed literature review of CAI in TTTS fetoscopy. Through this challenge, its analysis and the release of multi-centre fetoscopic data, we provide a benchmark for future research in this field.
Objective. This work investigates the use of deep convolutional neural networks (CNN) to automatically perform measurements of fetal body parts, including head circumference, biparietal diameter, abdominal circumference and femur length, and to estimate gestational age and fetal weight from fetal ultrasound videos. Approach. We developed a novel multi-task CNN-based spatio-temporal fetal US feature extraction and standard plane detection algorithm (called FUVAI) and evaluated the method on 50 freehand fetal US video scans. We compared FUVAI's fetal biometric measurements with measurements made by five experienced sonographers at two time points separated by at least two weeks, and estimated intra- and inter-observer variabilities. Main results. We found that the automated fetal biometric measurements obtained by FUVAI were comparable to those performed by experienced sonographers. The observed differences in measurement values were within the range of inter- and intra-observer variability, and analysis showed that these differences were not statistically significant when comparing any individual medical expert to our model. Significance. We argue that FUVAI has the potential to assist sonographers who perform fetal biometric measurements in clinical settings by providing suggestions for the best measurement frames along with automated measurements. Moreover, FUVAI performs these tasks in just a few seconds, compared with the roughly six minutes sonographers need on average. This is significant given the shortage of medical experts capable of interpreting fetal ultrasound images in numerous countries.
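For readers unfamiliar with the agreement analysis implied here, a common way to test whether model-versus-expert differences fall within observer variability is Bland-Altman limits of agreement. The snippet below is a generic illustration with synthetic head-circumference values, not the paper's evaluation code.

```python
import numpy as np

def limits_of_agreement(a: np.ndarray, b: np.ndarray):
    """Mean difference (bias) and 95% limits of agreement between two raters."""
    diff = a - b
    bias = diff.mean()
    spread = 1.96 * diff.std(ddof=1)
    return bias, (bias - spread, bias + spread)

# Synthetic head-circumference measurements in millimetres (illustrative only):
model_mm = np.array([310.2, 298.5, 305.1, 321.7])
expert_mm = np.array([311.0, 297.9, 306.4, 320.9])
print(limits_of_agreement(model_mm, expert_mm))
```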
Predicting fetal weight at birth is an important aspect of perinatal care, particularly in the context of antenatal management, which includes the planned timing and mode of delivery. Accurate weight prediction from prenatal ultrasound is challenging, as it requires images of specific fetal body parts late in pregnancy, which are difficult to capture because reduced amniotic fluid degrades image quality. As a consequence, predictions that rely on standard methods often suffer from significant errors. In this paper we propose the Residual Transformer Module, which extends a 3D ResNet-based network for the analysis of 2D+t spatio-temporal ultrasound video scans. Our end-to-end method, called BabyNet, automatically predicts fetal birth weight from fetal ultrasound video scans. We evaluate BabyNet on a dedicated clinical dataset comprising 225 2D fetal ultrasound videos from 75 patients, acquired one day prior to delivery. Experimental results show that BabyNet outperforms several state-of-the-art methods and estimates weight at birth with accuracy comparable to human experts. Furthermore, combining the estimates of human experts with those computed by BabyNet yields the best results, outperforming all other methods by a significant margin. The source code of BabyNet is available at https://github.com/SanoScience/BabyNet.
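The abstract does not detail the Residual Transformer Module, so the following is only a plausible minimal form: self-attention over flattened spatio-temporal tokens, added back to the 3D ResNet feature map through a residual connection. Token layout and sizes are assumptions; see https://github.com/SanoScience/BabyNet for the actual implementation.

```python
import torch
import torch.nn as nn

class ResidualTransformerModule(nn.Module):
    def __init__(self, channels: int, n_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.mhsa = nn.MultiheadAttention(channels, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, T, H, W) feature map from a 3D ResNet stage.
        b, c, t, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)             # (B, T*H*W, C)
        q = self.norm(tokens)
        attn, _ = self.mhsa(q, q, q)
        out = tokens + attn                               # residual connection
        return out.transpose(1, 2).reshape(b, c, t, h, w)

y = ResidualTransformerModule(64)(torch.randn(1, 64, 8, 7, 7))
```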
A critical step in the fight against COVID-19, which continues to have a catastrophic impact on people's lives, is the effective screening of patients presenting in clinics with severe COVID-19 symptoms. Chest radiography is one of the promising screening approaches, and many studies have reported accurate detection of COVID-19 in chest X-rays using deep learning. A serious limitation of many published approaches, however, is insufficient attention to explaining the decisions made by deep learning models. Using explainable artificial intelligence methods, we demonstrate that model decisions may rely on confounding factors rather than medical pathology. After analysing potential confounding factors found in chest X-ray images, we propose a novel method to minimise their negative impact. We show that our proposed method is more robust than previous attempts to counter confounding factors, such as ECG leads in chest X-rays, that often influence model classification decisions. In addition to being robust, our method achieves results comparable to the state-of-the-art. The source code and pre-trained weights are publicly available (https://github.com/tomek1911/POTHER).
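A typical way to surface such confounders is a saliency map over the classifier's last convolutional layer. The snippet below is a minimal hand-rolled Grad-CAM on a generic ResNet, not the POTHER code: strong attribution on ECG leads or text markers rather than lung fields would flag a shortcut.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=2).eval()     # stand-in chest X-ray classifier
feats, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)            # stand-in chest X-ray
model(x)[0].max().backward()               # gradient of the top class score

w = grads["a"].mean(dim=(2, 3), keepdim=True)          # channel importance
cam = F.relu((w * feats["a"]).sum(1, keepdim=True))    # coarse heatmap
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
```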
In this paper, we propose an end-to-end multi-task neural network, called FetalNet, with an attention mechanism and a stacked module for spatio-temporal fetal ultrasound scan video analysis. Fetal biometric measurement is a standard examination during pregnancy, used for fetal growth monitoring and the estimation of gestational age and fetal weight. The main goal in fetal ultrasound scan video analysis is to find the proper standard planes for measuring the fetal head, abdomen and femur. Due to the inherently high speckle noise and shadows in ultrasound data, medical expertise and sonographic experience are required to find the appropriate acquisition plane and perform accurate measurements of the fetus. In addition, existing computer-aided methods for fetal US biometric measurement analyze only a single image frame without considering temporal features. To address these shortcomings, we propose an end-to-end multi-task neural network for spatio-temporal ultrasound scan video analysis that simultaneously localizes, classifies and measures the fetal body parts. We propose a new encoder-decoder segmentation architecture that incorporates a classification branch. Additionally, we employ an attention mechanism with a stacked module to learn saliency maps that suppress irrelevant US regions and enable efficient scan plane localization. We trained FetalNet on fetal ultrasound videos from routine examinations of 700 different patients. FetalNet outperforms existing state-of-the-art methods in both classification and segmentation on fetal ultrasound video recordings.
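As an illustration of the attention idea, the sketch below shows a generic saliency gate: a single-channel spatial map, learned end-to-end, that multiplicatively suppresses irrelevant ultrasound regions before features reach the decoder. This is a common formulation, not the exact FetalNet module.

```python
import torch
import torch.nn as nn

class SaliencyGate(nn.Module):
    """Learns a (B, 1, H, W) saliency map and gates the feature map with it."""

    def __init__(self, channels: int):
        super().__init__()
        self.att = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 1), nn.ReLU(),
            nn.Conv2d(channels // 2, 1, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.att(x)     # broadcast the saliency map over channels

gated = SaliencyGate(32)(torch.randn(2, 32, 64, 64))
```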
Artificial intelligence (AI) has significant potential to positively impact and advance medical imaging, including positron emission tomography (PET) imaging applications. AI has the ability to enhance and optimize all aspects of the PET imaging chain, from patient scheduling, patient setup, protocoling, data acquisition, detector signal processing and reconstruction to image processing and interpretation. AI also poses industry-specific challenges which will need to be addressed and overcome to maximize its future potential in PET. This paper provides an overview of these industry-specific challenges for the development, standardization, commercialization, and clinical adoption of AI, and explores the potential enhancements to PET imaging brought about by AI in the near future. In particular, the combination of on-demand image reconstruction, AI, and custom-designed data processing workflows may open new possibilities for innovation which would positively impact the industry and ultimately patients.