Image segmentation plays a crucial role in extracting important objects of interest from images, enabling various applications. While existing methods have shown success in segmenting clean images, they often struggle to produce accurate segmentation results when dealing with degraded images, such as those containing noise or occlusions. To address this challenge, interactive segmentation has emerged as a promising approach, allowing users to provide meaningful input to guide the segmentation process. However, an important problem in interactive segmentation lies in determining how to incorporate minimal yet meaningful user guidance into the segmentation model. In this paper, we propose the quasi-conformal interactive segmentation (QIS) model, which incorporates user input in the form of positive and negative clicks. Users mark a few pixels belonging to the object region as positive clicks, indicating that the segmentation model should include a region around these clicks. Conversely, negative clicks are provided on pixels belonging to the background, instructing the model to exclude the region near these clicks from the segmentation mask. Additionally, the segmentation mask is obtained by deforming a template mask with the same topology as the object of interest using an orientation-preserving quasiconformal mapping. This approach helps to avoid topological errors in the segmentation results. We provide a thorough analysis of the proposed model, including theoretical support for the ability of QIS to include or exclude regions of interest or disinterest based on the user's indication. To evaluate the performance of QIS, we conduct experiments on synthesized images, medical images, natural images and noisy natural images. The results demonstrate the efficacy of our proposed method.
Accurate analysis and classification of facial attributes are essential in various applications, from human-computer interaction to security systems. In this work, a novel approach to enhance facial classification and recognition tasks through the integration of 3D facial models with deep learning methods was proposed. We extract the most useful information for various tasks using the 3D Facial Model, leading to improved classification accuracy. Combining 3D facial insights with ResNet architecture, our approach achieves notable results: 100% individual classification, 95.4% gender classification, and 83.5% expression classification accuracy. This method holds promise for advancing facial analysis and recognition research.
Images degraded by geometric distortions pose a significant challenge to imaging and computer vision tasks such as object recognition. Deep learning-based imaging models usually fail to give accurate performance for geometrically distorted images. In this paper, we propose the deformation-invariant neural network (DINN), a framework to address the problem of imaging tasks for geometrically distorted images. The DINN outputs consistent latent features for images that are geometrically distorted but represent the same underlying object or scene. The idea of DINN is to incorporate a simple component, called the quasiconformal transformer network (QCTN), into other existing deep networks for imaging tasks. The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images. It first outputs a Beltrami coefficient, which measures the quasiconformality of the output deformation map. By controlling the Beltrami coefficient, the local geometric distortion under the quasiconformal mapping can be controlled. The QCTN is lightweight and simple, which can be readily integrated into other existing deep neural networks to enhance their performance. Leveraging our framework, we have developed an image classification network that achieves accurate classification of distorted images. Our proposed framework has been applied to restore geometrically distorted images by atmospheric turbulence and water turbulence. DINN outperforms existing GAN-based restoration methods under these scenarios, demonstrating the effectiveness of the proposed framework. Additionally, we apply our proposed framework to the 1-1 verification of human face images under atmospheric turbulence and achieve satisfactory performance, further demonstrating the efficacy of our approach.
Fast and accurate MRI reconstruction is a key concern in modern clinical practice. Recently, numerous Deep-Learning methods have been proposed for MRI reconstruction, however, they usually fail to reconstruct sharp details from the subsampled k-space data. To solve this problem, we propose a lightweight and accurate Edge Attention MRI Reconstruction Network (EAMRI) to reconstruct images with edge guidance. Specifically, we design an efficient Edge Prediction Network to directly predict accurate edges from the blurred image. Meanwhile, we propose a novel Edge Attention Module (EAM) to guide the image reconstruction utilizing the extracted edge priors, as inspired by the popular self-attention mechanism. EAM first projects the input image and edges into Q_image, K_edge, and V_image, respectively. Then EAM pairs the Q_image with K_edge along the channel dimension, such that 1) it can search globally for the high-frequency image features that are activated by the edge priors; 2) the overall computation burdens are largely reduced compared with the traditional spatial-wise attention. With the help of EAM, the predicted edge priors can effectively guide the model to reconstruct high-quality MR images with accurate edges. Extensive experiments show that our proposed EAMRI outperforms other methods with fewer parameters and can recover more accurate edges.
Medical image segmentation aims to automatically extract anatomical or pathological structures in the human body. Most objects or regions of interest are of similar patterns. For example, the relative location and the relative size of the lung and the kidney differ little among subjects. Incorporating these morphology rules as prior knowledge into the segmentation model is believed to be an effective way to enhance the accuracy of the segmentation results. Motivated by this, we propose in this work the Topology-Preserving Segmentation Network (TPSN) which can predict segmentation masks with the same topology prescribed for specific tasks. TPSN is a deformation-based model that yields a deformation map through an encoder-decoder architecture to warp the template masks into a target shape approximating the region to segment. Comparing to the segmentation framework based on pixel-wise classification, deformation-based segmentation models that warp a template to enclose the regions are more convenient to enforce geometric constraints. In our framework, we carefully design the ReLU Jacobian regularization term to enforce the bijectivity of the deformation map. As such, the predicted mask by TPSN has the same topology as that of the template prior mask.
In medical imaging, surface registration is extensively used for performing systematic comparisons between anatomical structures, with a prime example being the highly convoluted brain cortical surfaces. To obtain a meaningful registration, a common approach is to identify prominent features on the surfaces and establish a low-distortion mapping between them with the feature correspondence encoded as landmark constraints. Prior registration works have primarily focused on using manually labeled landmarks and solving highly nonlinear optimization problems, which are time-consuming and hence hinder practical applications. In this work, we propose a novel framework for the automatic landmark detection and registration of brain cortical surfaces using quasi-conformal geometry and convolutional neural networks. We first develop a landmark detection network (LD-Net) that allows for the automatic extraction of landmark curves given two prescribed starting and ending points based on the surface geometry. We then utilize the detected landmarks and quasi-conformal theory for achieving the surface registration. Specifically, we develop a coefficient prediction network (CP-Net) for predicting the Beltrami coefficients associated with the desired landmark-based registration and a mapping network called the disk Beltrami solver network (DBS-Net) for generating quasi-conformal mappings from the predicted Beltrami coefficients, with the bijectivity guaranteed by quasi-conformal theory. Experimental results are presented to demonstrate the effectiveness of our proposed framework. Altogether, our work paves a new way for surface-based morphometry and medical shape analysis.
Medical image segmentation, which aims to automatically extract anatomical or pathological structures, plays a key role in computer-aided diagnosis and disease analysis. Despite the problem has been widely studied, existing methods are prone to topological errors. In medical imaging, the topology of the structure, such as the kidney or lung, is usually known. Preserving the topology of the structure in the segmentation process is of utmost importance for accurate image analysis. In this work, a novel learning-based segmentation model is proposed. A {\it topology-preserving segmentation network (TPSN)} is trained to give an accurate segmentation result of an input image that preserves the prescribed topology. TPSN is a deformation-based model that yields a deformation map through a UNet, which takes the medical image and a template mask as inputs. The main idea is to deform a template mask describing the prescribed topology by a diffeomorphism to segment the object in the image. The topology of the shape in the template mask is well preserved under the diffeomorphic map. The diffeomorphic property of the map is controlled by introducing a regularization term related to the Jacobian in the loss function. As such, a topology-preserving segmentation result can be guaranteed. Furthermore, a multi-scale TPSN is developed in this paper that incorporates multi-level information of images to produce more precise segmentation results. To evaluate our method, we applied the 2D TPSN on Ham10000 and 3D TPSN on KiTS21. Experimental results illustrate our method outperforms the baseline UNet segmentation model with/without connected-component analysis (CCA) by both the dice score and IoU score. Besides, results show that our method can produce reliable results even in challenging cases, where pixel-wise segmentation models by UNet and CCA fail to obtain accurate results.
Image registration, the process of defining meaningful correspondences between images, is essential for various image analysis tasks, especially medical imaging. Numerous learning-based methods, notably convolutional neural networks (CNNs), for deformable image registration proposed in recent years have demonstrated the feasibility and superiority of deep learning techniques for registration problems. Besides, compared to traditional algorithms' optimization scheme of the objective function for each image pair, learning-based algorithms are several orders of magnitude faster. However, these data-driven methods without proper constraint on the deformation field will easily lead to topological foldings. To tackle this problem, We propose the quasi-conformal registration network (QCRegNet), an unsupervised learning framework, to obtain diffeomorphic 2D image registrations with large deformations based on quasi-conformal (QC) map, an orientation-preserving homeomorphism between two manifolds. The basic idea is to design a CNN mapping image pairs to deformation fields. QCRegNet consists of the estimator network and the Beltrami solver network (BSNet). The estimator network takes image pair as input and outputs the Beltrami coefficient (BC). The BC, which captures conformal distortion of a QC map and guarantees the bijectivity, will then be input to the BSNet, a task-independent network which reconstructs the desired QC map. Furthermore, we reduce the number of network parameters and computational complexity by utilizing Fourier approximation to compress BC. Experiments have been carried out on different data such as underwater and medical images. Registration results show that the registration accuracy is comparable to state-of-the-art methods and diffeomorphism is to a great extent guaranteed compared to other diffeomorphic registration algorithms.
With the advancement of computer technology, there is a surge of interest in effective mapping methods for objects in higher-dimensional spaces. To establish a one-to-one correspondence between objects, higher-dimensional quasi-conformal theory can be utilized for ensuring the bijectivity of the mappings. In addition, it is often desirable for the mappings to satisfy certain prescribed geometric constraints and possess low distortion in conformality or volume. In this work, we develop a unifying framework for computing $n$-dimensional quasi-conformal mappings. More specifically, we propose a variational model that integrates quasi-conformal distortion, volumetric distortion, landmark correspondence, intensity mismatch and volume prior information to handle a large variety of deformation problems. We further prove the existence of a minimizer for the proposed model and devise efficient numerical methods to solve the optimization problem. We demonstrate the effectiveness of the proposed framework using various experiments in two- and three-dimensions, with applications to medical image registration, adaptive remeshing and shape modeling.
Image segmentation is to extract meaningful objects from a given image. For degraded images due to occlusions, obscurities or noises, the accuracy of the segmentation result can be severely affected. To alleviate this problem, prior information about the target object is usually introduced. In [10], a topology-preserving registration-based segmentation model was proposed, which is restricted to segment 2D images only. In this paper, we propose a novel 3D topology-preserving registration-based segmentation model with the hyperelastic regularization, which can handle both 2D and 3D images. The existence of the solution of the proposed model is established. We also propose a converging iterative scheme to solve the proposed model. Numerical experiments have been carried out on the synthetic and real images, which demonstrate the effectiveness of our proposed model.