Guang-Zhong Yang

CDFI: Cross Domain Feature Interaction for Robust Bronchi Lumen Detection

Apr 18, 2023
Jiasheng Xu, Tianyi Zhang, Yangqian Wu, Jie Yang, Guang-Zhong Yang, Yun Gu

Figures 1-4 for CDFI: Cross Domain Feature Interaction for Robust Bronchi Lumen Detection

Endobronchial intervention is increasingly used as a minimally invasive means for the treatment of pulmonary diseases. To reduce the difficulty of manipulation in complex airway networks, robust lumen detection is essential for intraoperative guidance. However, existing detection methods are sensitive to the visual artifacts that are inevitable during surgery. In this work, a Cross Domain Feature Interaction (CDFI) network is proposed to extract the structural features of lumens, as well as artifact cues that characterize the visual features. To extract the structural and artifact features effectively, a Quadruple Feature Constraints (QFC) module is designed to constrain the intrinsic connections among samples of varying imaging quality. Furthermore, a Guided Feature Fusion (GFF) module supervises the model to fuse features adaptively according to the type of artifact. Results show that the features extracted by the proposed method preserve the structural information of the lumen in the presence of large visual variations, yielding much-improved lumen detection accuracy.
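The abstract does not detail the GFF fusion rule; as a minimal sketch, adaptive fusion of a structural branch and an artifact-cue branch can be gated by softmax weights over per-branch logits. All names here (guided_fusion, gate_logits) are hypothetical illustrations, not the paper's API.

```python
import math

def guided_fusion(structural, artifact_cue, gate_logits):
    """Fuse two feature vectors with softmax gate weights.

    structural, artifact_cue: equal-length feature vectors.
    gate_logits: two logits, one per branch (in the paper the gate
    would be supervised by the artifact type; here it is just input).
    """
    m = max(gate_logits)                       # numerical stability
    exps = [math.exp(g - m) for g in gate_logits]
    z = sum(exps)
    w = [e / z for e in exps]                  # softmax weights
    return [w[0] * s + w[1] * a for s, a in zip(structural, artifact_cue)]
```

With equal gate logits the two branches contribute equally; skewing the logits toward one branch lets the network down-weight artifact-corrupted features.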

* 7 pages, 4 figures 

Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

Mar 10, 2023
Minghui Zhang, Yangqian Wu, Hanxiao Zhang, Yulei Qin, Hao Zheng, Wen Tang, Corey Arnold, Chenhao Pei, Pengxin Yu, Yang Nan, Guang Yang, Simon Walsh, Dominic C. Marshall, Matthieu Komorowski, Puyang Wang, Dazhou Guo, Dakai Jin, Ya'nan Wu, Shuiqing Zhao, Runsheng Chang, Boyu Zhang, Xing Lv, Abdul Qayyum, Moona Mazher, Qi Su, Yonghuang Wu, Ying'ao Liu, Yufei Zhu, Jiancheng Yang, Ashkan Pakzad, Bojidar Rangelov, Raul San Jose Estepar, Carlos Cano Espinosa, Jiayuan Sun, Guang-Zhong Yang, Yun Gu

Figures 1-4 for Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended the reach of pulmonary airway segmentation closer to the limit of image resolution. Since the EXACT'09 challenge on pulmonary airway segmentation, however, little effort has been directed to the quantitative comparison of newly emerged algorithms, despite the maturity of deep-learning-based approaches and the clinical drive to resolve finer details of the distal airways for early intervention in pulmonary diseases. Publicly available annotated datasets remain extremely limited, hindering the development of data-driven methods and the detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling challenge (ATM'22), held as an official challenge event at the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotations: 500 CT scans in total (300 for training, 50 for validation, and 150 for testing). The dataset was collected from multiple sites and includes a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the full challenge, and the algorithms of the top ten teams are reviewed in this paper. Quantitative and qualitative results reveal that deep learning models embedded with topological continuity enhancement generally achieved superior performance. ATM'22 follows an open-call design: the training data and the gold-standard evaluation remain available upon successful registration via its homepage.
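Airway segmentation benchmarks of this kind are typically scored with both overlap and topology-aware metrics (the exact ATM'22 definitions are on the challenge homepage). A toy sketch over voxel index sets, with illustrative names:

```python
def dice(pred, gt):
    """Volumetric overlap: 2|P∩G| / (|P|+|G|)."""
    pred, gt = set(pred), set(gt)
    if not pred and not gt:
        return 1.0
    return 2 * len(pred & gt) / (len(pred) + len(gt))

def tree_length_detected_rate(pred, centerline):
    """Fraction of ground-truth centerline voxels covered by the
    prediction; sensitive to missed distal branches in a way that
    plain Dice, dominated by the trachea, is not."""
    centerline = set(centerline)
    return len(set(pred) & centerline) / len(centerline)
```

A prediction can score a high Dice while missing thin distal branches, which is why topology-aware rates are reported alongside overlap.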

* 32 pages, 16 figures. Homepage: https://atm22.grand-challenge.org/. Submitted 

MR Elastography with Optimization-Based Phase Unwrapping and Traveling Wave Expansion-based Neural Network (TWENN)

Jan 06, 2023
Shengyuan Ma, Runke Wang, Suhao Qiu, Ruokun Li, Qi Yue, Qingfang Sun, Liang Chen, Fuhua Yan, Guang-Zhong Yang, Yuan Feng

Figures 1-4 for MR Elastography with Optimization-Based Phase Unwrapping and Traveling Wave Expansion-based Neural Network (TWENN)

Magnetic Resonance Elastography (MRE) can characterize the biomechanical properties of soft tissue for disease diagnosis and treatment planning. However, the complicated wavefields acquired in MRE, coupled with noise, pose challenges for accurate displacement extraction and modulus estimation. Here we propose a pipeline for processing MRE images using optimization-based displacement extraction and Traveling Wave Expansion-based Neural Network (TWENN) modulus estimation. Phase unwrapping and displacement extraction are achieved by optimizing an objective function with Dual Data Consistency (Dual-DC). A complex-valued neural network taking the displacement covariance as input is constructed to estimate complex wavenumbers. A traveling wave expansion model is used to generate training datasets with different noise levels for the network. The complex shear modulus map is obtained by fusing multifrequency and multidirectional data. Validation on simulated brain and liver images demonstrates the practical value of the proposed pipeline, which estimates biomechanical properties with the lowest root-mean-square errors compared with state-of-the-art methods. Applications of the method to MRE images of a phantom, the brain, and the liver show clear anatomical features, and demonstrate that the pipeline is robust to noise and generalizes well.
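For a traveling shear wave, the estimated complex wavenumber k maps to the complex shear modulus via G* = ρω²/k², which is the quantity the multifrequency fusion step combines. A minimal numeric sketch (function name and default density are illustrative assumptions, not from the paper):

```python
import math

def complex_shear_modulus(wavenumber, frequency_hz, density=1000.0):
    """G* = rho * omega^2 / k^2 for a viscoelastic medium.

    wavenumber: shear wavenumber k in rad/m; a complex k encodes
    attenuation in its imaginary part (a real k gives a purely
    elastic modulus).
    density: tissue density in kg/m^3 (~1000 for soft tissue).
    Returns G* in Pa.
    """
    omega = 2 * math.pi * frequency_hz
    return density * omega ** 2 / wavenumber ** 2
```

Sanity check: a 2 m/s shear wave at 60 Hz has k = 2πf/c = 60π rad/m, giving G = ρc² = 4 kPa, a plausible order of magnitude for brain tissue.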


Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition

Oct 08, 2022
Yuxuan Shu, Xiao Gu, Guang-Zhong Yang, Benny Lo

Figures 1-4 for Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition

The success of most advanced facial expression recognition methods relies heavily on large-scale annotated datasets. However, acquiring clean and consistent annotations for facial expression datasets poses great challenges. On the other hand, self-supervised contrastive learning has gained great popularity thanks to its simple yet effective instance-discrimination training strategy, which can potentially circumvent the annotation issue. Nevertheless, instance-level discrimination has inherent disadvantages, which become even more pronounced for complicated facial representations. In this paper, we revisit the use of self-supervised contrastive learning and explore three core strategies to enforce expression-specific representations and to minimize interference from other facial attributes, such as identity and face styling. Experimental results show that our proposed method outperforms current state-of-the-art self-supervised learning methods on both categorical and dimensional facial expression recognition tasks.
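The instance-discrimination strategy mentioned above is usually implemented with an InfoNCE-style contrastive loss: the anchor's positive pair is pulled close while negatives are pushed away. A dependency-free sketch for a single anchor (names are illustrative, not the paper's code):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor over cosine similarities:
    -log softmax of the positive among {positive} ∪ negatives."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    m = max(logits)                       # log-sum-exp stabilization
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]
```

The paper's contribution is in how positives and negatives are chosen so that the loss captures expression rather than identity; the loss form itself is standard.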

* Accepted to BMVC 2022 

Differentiable Topology-Preserved Distance Transform for Pulmonary Airway Segmentation

Sep 17, 2022
Minghui Zhang, Guang-Zhong Yang, Yun Gu

Figures 1-4 for Differentiable Topology-Preserved Distance Transform for Pulmonary Airway Segmentation

Detailed pulmonary airway segmentation is a clinically important task for endobronchial intervention and the treatment of peripheral lung cancer lesions. Convolutional neural networks (CNNs) are promising tools for medical image analysis but perform poorly when the feature distribution is significantly imbalanced. This is the case for airway data: the trachea and principal bronchi dominate most of the voxels, whereas the lobar and distal segmental bronchi occupy only a small proportion. In this paper, we propose a Differentiable Topology-Preserved Distance Transform (DTPDT) framework to improve airway segmentation performance. A Topology-Preserved Surrogate (TPS) learning strategy is first proposed to equalize training progress across the within-class distribution. Furthermore, a Convolutional Distance Transform (CDT) is designed to identify breakages with improved sensitivity by minimizing the variation of the distance map between prediction and ground truth. The proposed method is validated on publicly available reference airway segmentation datasets.
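The CDT idea of comparing distance maps between prediction and ground truth can be illustrated in 1-D with the classic two-pass distance transform. This toy is not differentiable (the paper's point is a convolutional, differentiable approximation), and the function names are illustrative:

```python
def distance_transform_1d(mask):
    """Distance (in pixels) from each position to the nearest
    foreground pixel, via the standard forward/backward sweep."""
    INF = float("inf")
    d = [0 if m else INF for m in mask]
    for i in range(1, len(d)):            # forward pass
        d[i] = min(d[i], d[i - 1] + 1)
    for i in range(len(d) - 2, -1, -1):   # backward pass
        d[i] = min(d[i], d[i + 1] + 1)
    return d

def breakage_score(pred_mask, gt_mask):
    """Mean absolute difference of the two distance maps; a break in
    the predicted airway inflates distances locally, so the score is
    sensitive to discontinuity, not just voxel overlap."""
    dp = distance_transform_1d(pred_mask)
    dg = distance_transform_1d(gt_mask)
    return sum(abs(a - b) for a, b in zip(dp, dg)) / len(gt_mask)
```

A single missing voxel in an otherwise perfect tube changes the distance map in its neighborhood, which is exactly the breakage phenomenon the CDT targets.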

* 12 pages, 7 figures 

A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Aug 25, 2022
Yu Chen, Xu Cao, Xiaoyi Lin, Baoru Huang, Xiao-Yun Zhou, Jian-Qing Zheng, Guang-Zhong Yang

Figures 1-4 for A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Accurate motion and depth recovery is important for many robot vision tasks, including autonomous driving. Most previous studies have achieved cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA) mechanisms. F2D and D2F enable multi-scale information integration between the optical flow and depth domains based on differentiable shallow networks. A dual-head mechanism predicts optical flow for rigid and non-rigid motion in a divide-and-conquer manner, which significantly improves optical flow estimation. Furthermore, to make predictions more robust and stable, EMA is used during multi-task training. Experimental results on the KITTI datasets show that our scheme outperforms other multi-task schemes and provides marked improvements in prediction quality.
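The EMA component is standard: a shadow copy of the parameters tracks the training weights, smoothing out step-to-step noise. A one-line sketch (the decay value is illustrative):

```python
def ema_update(ema_params, new_params, decay=0.99):
    """One exponential-moving-average step over a flat parameter list:
    ema <- decay * ema + (1 - decay) * new."""
    return [decay * e + (1 - decay) * p for e, p in zip(ema_params, new_params)]
```

At evaluation time the EMA copy, not the raw weights, is typically used, which is what makes the multi-task predictions more stable.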


Re-thinking and Re-labeling LIDC-IDRI for Robust Pulmonary Cancer Prediction

Jul 28, 2022
Hanxiao Zhang, Xiao Gu, Minghui Zhang, Weihao Yu, Liang Chen, Zhexin Wang, Feng Yao, Yun Gu, Guang-Zhong Yang

Figures 1-4 for Re-thinking and Re-labeling LIDC-IDRI for Robust Pulmonary Cancer Prediction

The LIDC-IDRI database is the most popular benchmark for lung cancer prediction. However, because its nodule annotations are subjective assessments by radiologists, nodules in LIDC may carry malignancy labels entirely different from the pathological ground truth, introducing label assignment errors and consequent supervision bias during training. The LIDC database thus requires more objective labels for learning-based cancer prediction. Based on an additional small dataset of 180 nodules diagnosed by pathological examination, we propose to re-label the LIDC data to mitigate the original annotation bias, as verified on this robust benchmark. We demonstrate that providing new labels by similar-nodule retrieval based on metric learning is an effective re-labeling strategy. Training on the re-labeled LIDC nodules improves model performance, and the improvement grows when new labels for uncertain nodules are added. We further conclude that re-labeling LIDC is currently an expedient route to robust lung cancer prediction, while building a large pathologically proven nodule database provides the long-term solution.
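Similar-nodule retrieval for re-labeling can be sketched as k-nearest-neighbour majority voting in the learned embedding space: each LIDC nodule inherits the pathological label of its closest pathologically proven neighbours. Embeddings, labels, and function names below are illustrative:

```python
import math
from collections import Counter

def relabel_by_retrieval(query_embedding, bank, k=3):
    """bank: list of (embedding, pathological_label) pairs from the
    pathologically proven set. Returns the majority label of the k
    nearest neighbours under Euclidean distance in the metric space."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(bank, key=lambda item: dist(query_embedding, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

The quality of the new labels hinges entirely on the metric-learning step that produces the embeddings; the retrieval itself is this simple.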


Tackling Long-Tailed Category Distribution Under Domain Shifts

Jul 20, 2022
Xiao Gu, Yao Guo, Zeju Li, Jianing Qiu, Qi Dou, Yuxuan Liu, Benny Lo, Guang-Zhong Yang

Figures 1-4 for Tackling Long-Tailed Category Distribution Under Domain Shifts

Machine learning models fail to perform well in real-world applications when 1) the category distribution P(Y) of the training dataset is long-tailed and 2) the test data are drawn from different conditional distributions P(X|Y). Existing approaches cannot handle the scenario where both issues co-exist, which is nonetheless common in practice. In this study, we take a step forward and examine long-tailed classification under domain shifts. We design three novel core functional blocks: a Distribution Calibrated Classification Loss, Visual-Semantic Mapping, and Semantic-Similarity Guided Augmentation. Furthermore, we adopt a meta-learning framework that integrates the three blocks to improve domain generalization on unseen target domains. Two new datasets, AWA2-LTS and ImageNet-LTS, are proposed for this problem. Extensive experimental results on both datasets demonstrate that our method achieves superior performance over state-of-the-art long-tailed and domain generalization approaches, as well as their combinations. Source code and datasets can be found at our project page https://xiaogu.site/LTDS.

* accepted to ECCV 2022 

Human-Robot Shared Control for Surgical Robot Based on Context-Aware Sim-to-Real Adaptation

Apr 23, 2022
Dandan Zhang, Zicong Wu, Junhong Chen, Ruiqi Zhu, Adnan Munawar, Bo Xiao, Yuan Guan, Hang Su, Wuzhou Hong, Yao Guo, Gregory S. Fischer, Benny Lo, Guang-Zhong Yang

Figures 1-4 for Human-Robot Shared Control for Surgical Robot Based on Context-Aware Sim-to-Real Adaptation

Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitating efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some surgical subtasks within the shared control mechanism. However, a sufficient amount of data is required for the robot to learn the manoeuvres, and using a surgical simulator to collect data is a less resource-demanding approach. With sim-to-real adaptation, manoeuvres learned in a simulator can be transferred to a physical robot. To this end, we propose a sim-to-real adaptation method to construct a human-robot shared control framework for robotic surgery. In this paper, a desired trajectory is generated from a simulator using an LfD method, while dynamic motion primitives (DMPs) are used to transfer the desired trajectory from the simulator to the physical robotic platform. Moreover, a role adaptation mechanism is developed such that the robot can adjust its role according to the surgical operation context predicted by a neural network model. The effectiveness of the proposed framework is validated on the da Vinci Research Kit (dVRK). Results of the user studies indicate that with the adaptive human-robot shared control framework, the path length of the remote controller, the total number of clutches, and the task completion time are reduced significantly. The proposed method outperforms traditional manual control via teleoperation.
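The DMP transfer step relies on a point-attractor system that re-targets a demonstrated trajectory to a new goal on the physical robot. A minimal 1-D sketch without the learned forcing term (the gains and step size are illustrative; a full DMP adds a forcing term encoding the demonstration's shape):

```python
def dmp_rollout(start, goal, n_steps=1000, alpha=25.0, beta=6.25, dt=0.01):
    """Integrate the DMP transformation system
        y'' = alpha * (beta * (goal - y) - y')
    with explicit Euler. With beta = alpha / 4 the attractor is
    critically damped, so y converges to goal without overshoot."""
    y, dy = start, 0.0
    for _ in range(n_steps):
        ddy = alpha * (beta * (goal - y) - dy)
        dy += ddy * dt
        y += dy * dt
    return y
```

Because the attractor is parameterized by the goal, the same primitive learned in simulation can be replayed toward a goal measured on the physical platform, which is the essence of the transfer.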

* Accepted by ICRA 

A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

Apr 12, 2022
Ruiyang Zhao, Zhao He, Tao Wang, Suhao Qiu, Pawel Herman, Yanle Hu, Chencheng Zhang, Dinggang Shen, Bomin Sun, Guang-Zhong Yang, Yuan Feng

Figures 1-4 for A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

Interventional magnetic resonance imaging (i-MRI) for surgical guidance can help visualize interventional processes such as deep brain stimulation (DBS), improving surgical performance and patient outcome. Unlike the retrospective reconstruction used in conventional dynamic imaging, i-MRI for DBS must acquire and reconstruct the interventional images sequentially, online. Here we propose a convolutional long short-term memory (Conv-LSTM) based recurrent neural network, ConvLR, to reconstruct interventional images acquired with golden-angle radial sampling. Using an initializer and Conv-LSTM blocks, priors from the pre-operative reference image and the intra-operative frames are exploited to reconstruct the current frame. Data consistency for radial sampling is enforced by a soft-projection method. To improve reconstruction accuracy, an adversarial learning strategy is adopted. A set of interventional images based on pre-operative and post-operative MR images was simulated for algorithm validation. Results show that with only 10 radial spokes, ConvLR outperforms state-of-the-art methods, giving an acceleration of up to 40-fold. The proposed algorithm has the potential to achieve real-time i-MRI for DBS and can be used for general-purpose MR-guided interventions.
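Golden-angle radial sampling increments each spoke's azimuth by 180°/φ (φ the golden ratio), about 111.25°, so any consecutive window of spokes covers k-space nearly uniformly, which is what makes sliding-window online reconstruction from as few as 10 spokes plausible. A small sketch (function name is illustrative):

```python
import math

PHI = (1 + math.sqrt(5)) / 2              # golden ratio

def golden_angle_spokes(n_spokes):
    """Azimuthal angle of each radial spoke, in degrees, folded
    into [0, 180) since a spoke and its opposite sample the same
    k-space line."""
    step = 180.0 / PHI                     # ~111.246 degrees per spoke
    return [(i * step) % 180.0 for i in range(n_spokes)]
```

Because the increment is irrational relative to 180°, no two spokes ever coincide, and each new spoke fills the current largest angular gap.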
