On one hand, the transmitted ultrasound beam is attenuated as it propagates through the tissue. On the other hand, the received Radio-Frequency (RF) data contains additive Gaussian noise introduced by the acquisition card and the sensor. These two factors lead to a Signal to Noise Ratio (SNR) in the RF data that decreases with depth, effectively rendering deep regions of B-Mode images highly unreliable. There are three common approaches to mitigate this problem. First, increasing the power of the transmitted beam, which is limited by safety thresholds. Second, averaging consecutive frames, which not only reduces the framerate but is also not applicable to moving targets. Third, reducing the transmission frequency, which deteriorates spatial resolution. Many deep denoising techniques have been developed, but they often require clean data for training, which is usually only available in simulated images. Herein, a deep noise reduction approach is proposed that does not need clean training targets. The model is trained on noisy input-output pairs, and the training process converges to the clean image, which is the average of the noisy pairs. Experimental results on real phantom as well as ex vivo data confirm the efficacy of the proposed method for noise cancellation.
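As a rough illustration of training without clean targets (a minimal Noise2Noise-style sketch under assumed architecture, loss, and data; not the exact model of the paper):

\begin{verbatim}
# Minimal sketch: training a denoiser on pairs of noisy acquisitions.
# The architecture, optimizer, and data below are illustrative assumptions.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x):
        return self.net(x)

model = Denoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder data; in practice each pair holds two independently noisy
# acquisitions of the same scene (e.g., consecutive RF frames).
loader = [(torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64))]

for noisy_in, noisy_target in loader:
    opt.zero_grad()
    # With zero-mean noise and an L2 loss, the minimizer approaches the
    # clean (average) image, so no clean target is needed.
    loss = loss_fn(model(noisy_in), noisy_target)
    loss.backward()
    opt.step()
\end{verbatim}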
Time delay estimation (TDE) between two radio-frequency (RF) frames is one of the major steps of quasi-static ultrasound elastography, which detects tissue pathology by estimating its mechanical properties. Regularized optimization-based techniques, a prominent class of TDE algorithms, optimize a non-linear energy functional consisting of data constancy and spatial continuity constraints to obtain the displacement and strain maps between the time-series frames under consideration. Existing optimization-based TDE methods often consider the L2-norm of displacement derivatives to construct the regularizer. However, such a formulation over-penalizes displacement irregularities and poses two major issues for the estimated strain field. First, the boundaries between different tissues are blurred. Second, the visual contrast between the target and the background is suboptimal. To resolve these issues, herein, we propose a novel TDE algorithm in which the L1-norms of both first- and second-order displacement derivatives, instead of the L2-norm, are taken into account to devise the continuity functional. We handle the non-differentiability of the L1-norm by smoothing the sharp corner of the absolute value function and optimize the resulting cost function in an iterative manner. We call our technique Second-Order Ultrasound eLastography with L1-norm spatial regularization (L1-SOUL). In terms of both sharpness and visual contrast, L1-SOUL substantially outperforms GLUE, OVERWIND, and SOUL, three recently published TDE algorithms, in all validation experiments performed in this study. For the simulated, phantom, and in vivo datasets, respectively, L1-SOUL achieves 67.8%, 46.81%, and 117.35% improvements in contrast-to-noise ratio (CNR) over SOUL. The L1-SOUL code can be downloaded from http://code.sonography.ai.
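The smoothing of the L1-norm mentioned above can be illustrated with the common differentiable surrogate $|x| \approx \sqrt{x^2 + \epsilon^2}$; the sketch below (an illustrative assumption, not the exact L1-SOUL implementation) applies it to first- and second-order derivatives of a displacement field:

\begin{verbatim}
# Sketch of an L1-style continuity term with a smoothed absolute value.
import numpy as np

def smooth_abs(x, eps=1e-3):
    # Differentiable surrogate for |x|; eps rounds the corner at x = 0.
    return np.sqrt(x**2 + eps**2)

def l1_continuity(d, eps=1e-3):
    # d: 2-D displacement field, shape (H, W). Penalize first- and
    # second-order derivatives along both axes.
    total = 0.0
    for axis in (0, 1):
        total += smooth_abs(np.diff(d, n=1, axis=axis), eps).sum()
        total += smooth_abs(np.diff(d, n=2, axis=axis), eps).sum()
    return total

d = np.random.randn(64, 64)   # placeholder displacement field
print(l1_continuity(d))
\end{verbatim}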
Beamforming is an essential step in the ultrasound image formation pipeline and has attracted growing interest recently. An important goal of beamforming is to improve the quality of the Point Spread Function (PSF), which is far from an ideal Dirac delta function in ultrasound imaging. Therefore, deconvolution, a well-known post-processing method, is also used to mitigate the adverse effects of the PSF. Unfortunately, these two steps have only been used separately in a sequential approach. Herein, a novel framework for combining both methods in ultrasound image reconstruction is introduced. More specifically, the proposed formulation is a regularized inverse problem comprising two linear models, for beamforming and deconvolution, plus an additional sparsity constraint. We take advantage of the Alternating Direction Method of Multipliers (ADMM) algorithm to solve the joint optimization problem. The performance evaluation is presented on a set of publicly available simulations, real phantoms, and in vivo data from the Plane-wave Imaging Challenge in Medical UltraSound (PICMUS). Furthermore, the superiority of the proposed approach over the sequential approach, as well as over beamforming and deconvolution alone, is also shown. Results demonstrate that our approach combines the advantages of both methods and offers ultrasound images with high resolution and contrast.
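To make the ADMM machinery concrete, the following sketch solves a generic problem of the form min_x 0.5||y - Ax||_2^2 + lam ||x||_1, where A would stack the beamforming and PSF (convolution) models; it is an illustrative solver, not the paper's exact formulation:

\begin{verbatim}
# Generic ADMM sketch for an l1-regularized linear inverse problem.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_l1(A, y, lam, rho=1.0, n_iter=100):
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # Pre-factorize the x-update system (A^T A + rho I) x = A^T y + rho (z - u).
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Aty = A.T @ y
    for _ in range(n_iter):
        rhs = Aty + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # quadratic step
        z = soft_threshold(x + u, lam / rho)               # sparsity step
        u += x - z                                         # dual update
    return z
\end{verbatim}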
Medical ultrasound (US) imaging has become a prominent modality for breast cancer imaging due to its ease of use, low cost, and safety. In the past decade, convolutional neural networks (CNNs) have emerged as the method of choice in vision applications and have shown excellent potential in the automatic classification of US images. Despite their success, their restricted local receptive field limits their ability to learn global context information. Recently, Vision Transformer (ViT) designs, based on self-attention between image patches, have shown great potential as an alternative to CNNs. In this study, for the first time, we utilize ViT to classify breast US images using different augmentation strategies. The results are provided as classification accuracy and Area Under the Curve (AUC) metrics, and the performance is compared with that of state-of-the-art CNNs. The results indicate that ViT models perform comparably to, or even better than, CNNs in the classification of breast US images.
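A minimal fine-tuning sketch (assuming the timm library; the study's exact models and augmentation strategies are not reproduced here) could look as follows:

\begin{verbatim}
# Sketch: fine-tuning a pretrained ViT for binary breast-US classification.
import timm
import torch

model = timm.create_model('vit_base_patch16_224', pretrained=True,
                          num_classes=2)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

# Placeholder batch; grayscale US images are replicated to 3 channels and
# resized to 224x224, with augmentations applied inside the Dataset.
loader = [(torch.randn(2, 3, 224, 224), torch.tensor([0, 1]))]

for images, labels in loader:
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
\end{verbatim}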
Ultrasound elastography aims to determine the mechanical properties of the tissue by monitoring tissue deformation due to internal or external forces. Tissue deformations are estimated from ultrasound radio-frequency (RF) signals, a step often referred to as time delay estimation (TDE). Given two RF frames I1 and I2, we can compute a displacement image that shows how each sample in I1 moves to a new position in I2. Two important challenges in TDE are high computational complexity and the difficulty of choosing suitable RF frames. Selecting suitable frames is of high importance because many pairs of RF frames either do not have acceptable deformation for extracting informative strain images or are decorrelated, so that deformation cannot be reliably estimated. Herein, we introduce a method that learns 12 displacement modes in quasi-static elastography by performing Principal Component Analysis (PCA) on displacement fields from a large training database. In the inference stage, we use dynamic programming (DP) to compute an initial displacement estimate for around 1% of the samples, and then decompose this sparse displacement into a linear combination of the 12 displacement modes. Our method assumes that the displacement of the whole image can also be described by this linear combination of principal components. We then use the GLobal Ultrasound Elastography (GLUE) method to fine-tune the result, yielding the final displacement image. Our method, which we call PCA-GLUE, is more than 10 times faster than DP in calculating the initial displacement map while giving the same result. Our second contribution in this paper is determining the suitability of the frame pair I1 and I2 for strain estimation, which we achieve by feeding the weight vector computed for PCA-GLUE to a multi-layer perceptron (MLP) classifier.
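The PCA step can be sketched as follows (variable shapes and the least-squares fit are illustrative assumptions; the actual PCA-GLUE pipeline also includes the DP initialization and GLUE refinement described above):

\begin{verbatim}
# Sketch: learn 12 principal displacement modes, then fit a sparse
# displacement estimate as a linear combination of those modes.
import numpy as np

rng = np.random.default_rng(0)
train_disp = rng.standard_normal((200, 5000))  # placeholder: flattened
                                               # training displacement fields
mean = train_disp.mean(axis=0)
_, _, Vt = np.linalg.svd(train_disp - mean, full_matrices=False)
modes = Vt[:12]                                # 12 principal modes

def fit_modes(sparse_disp, sparse_idx):
    # Fit mode weights w on the ~1% of samples estimated by DP, then
    # reconstruct the dense initial displacement for the whole image.
    A = modes[:, sparse_idx].T
    w, *_ = np.linalg.lstsq(A, sparse_disp - mean[sparse_idx], rcond=None)
    return mean + w @ modes

sparse_idx = rng.choice(5000, size=50, replace=False)
dense_init = fit_modes(train_disp[0, sparse_idx], sparse_idx)
\end{verbatim}

The weight vector w is the same quantity that is later fed to the MLP classifier to judge frame-pair suitability.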
Medical ultrasound provides images that are the spatial map of the tissue echogenicity. Unfortunately, an ultrasound image is a low-quality version of the expected Tissue Reflectivity Function (TRF), mainly due to the non-ideal Point Spread Function (PSF) of the imaging system. This paper presents a novel beamforming approach based on deep learning to get closer to the ideal PSF in Plane-Wave Imaging (PWI). The proposed approach is designed to reconstruct the desired TRF from echo traces acquired by the transducer elements using only a single plane-wave transmission. In this approach, an ideal model for the TRF is first introduced by setting the imaging PSF as a sharp Gaussian function. Then, a mapping function between the pre-beamformed Radio-Frequency (RF) channel data and the proposed TRF is constructed using deep learning. The network architecture contains multi-resolution decomposition and reconstruction using the wavelet transform for effective recovery of the high-frequency content of the desired TRF. Inspired by curriculum learning, we exploit step-by-step training from a coarse (mean square error) to a fine ($\ell_{0.2}$) loss function. The proposed method is trained on a large number of simulated ultrasound images with ground-truth echogenicity maps extracted from real photographic images. The performance of the trained network is evaluated on publicly available simulated and \textit{in vivo} test data without any further fine-tuning. Simulation test results confirm that the proposed method reconstructs images of high quality in terms of resolution and contrast, which are also visually similar to the proposed ground-truth images. Furthermore, \textit{in vivo} results show that the trained mapping function preserves its performance in the new domain. Therefore, the proposed approach maintains high resolution, contrast, and framerate simultaneously.
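The coarse-to-fine loss schedule can be illustrated with a generic $\ell_p$ loss (the exact form and schedule used in the paper are assumptions here; a small epsilon is added for numerical stability):

\begin{verbatim}
# Sketch of curriculum training from a coarse (p = 2) to a fine (p = 0.2) loss.
import torch

def lp_loss(pred, target, p, eps=1e-8):
    # p = 2 behaves like MSE; small p (e.g., 0.2) emphasizes sparse,
    # high-frequency errors in the reconstructed TRF.
    return ((pred - target).abs() + eps).pow(p).mean()

pred = torch.randn(1, 1, 64, 64, requires_grad=True)  # placeholder output
target = torch.randn(1, 1, 64, 64)                    # placeholder TRF

for p in [2.0, 1.0, 0.5, 0.2]:                        # coarse-to-fine schedule
    loss = lp_loss(pred, target, p)
    loss.backward()
\end{verbatim}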
A common issue in exploiting simulated ultrasound data for training neural networks is the domain shift problem, where models trained on synthetic data do not generalize to clinical data. Recently, Fourier Domain Adaptation (FDA) has been proposed in the field of computer vision to tackle the domain shift problem by replacing the magnitude of the low-frequency spectrum of a synthetic sample (source) with that of a real sample (target). This method is attractive in ultrasound imaging given that two important differences between synthetic and real ultrasound data are caused by unknown values of attenuation and speed of sound (SOS) in real tissues. Attenuation leads to slow variations in the amplitude of the B-mode image, and SOS mismatch creates aberration and subsequent blurring. As such, both domain shifts cause differences in the low-frequency components of the envelope data, which are the components replaced by FDA. We demonstrate that applying the FDA method to synthetic data simulated by Field II yields a 3.5\% higher Dice similarity coefficient in a breast lesion segmentation task.
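The low-frequency magnitude swap at the core of FDA can be sketched as follows (assuming single-channel 2-D envelope images and a square low-frequency window of relative size beta, following the general FDA recipe):

\begin{verbatim}
# Sketch of Fourier Domain Adaptation: replace the low-frequency magnitude
# of a synthetic (source) image with that of a real (target) image, while
# keeping the source phase and hence its structure and labels.
import numpy as np

def fda_transfer(source, target, beta=0.05):
    Fs = np.fft.fftshift(np.fft.fft2(source))
    Ft = np.fft.fftshift(np.fft.fft2(target))
    amp_s, pha_s = np.abs(Fs), np.angle(Fs)
    h, w = source.shape
    b = int(min(h, w) * beta)                 # half-width of low-freq window
    ch, cw = h // 2, w // 2
    amp_s[ch-b:ch+b, cw-b:cw+b] = np.abs(Ft)[ch-b:ch+b, cw-b:cw+b]
    adapted = np.fft.ifft2(np.fft.ifftshift(amp_s * np.exp(1j * pha_s)))
    return np.real(adapted)
\end{verbatim}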
Convolutional neural networks (CNNs) have attracted rapidly growing interest for a variety of processing tasks in the medical ultrasound community. However, the performance of CNNs is highly reliant on both the amount and fidelity of the training data. Therefore, data scarcity is almost always a concern, particularly in the medical field, where clinical data is not easily accessible. The use of synthetic data is a popular approach to address this challenge. However, simulating a large number of images using packages such as Field II is time-consuming, and the distribution of simulated images is far from that of real images. Herein, we introduce a novel ultra-fast ultrasound image simulation method based on the Fourier transform and evaluate its performance in a lesion segmentation task. We demonstrate that data augmentation using images generated by the proposed method substantially outperforms Field II in terms of Dice similarity coefficient, while the simulation is almost 36000 times faster (both on CPU).
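While the paper's exact formulation is not reproduced here, a generic fast Fourier-based simulation convolves a random scatterer map with an assumed separable PSF via the FFT, as in the following hedged sketch:

\begin{verbatim}
# Hedged sketch of fast speckle simulation by FFT-based convolution of a
# scatterer map with a Gaussian-modulated PSF (illustrative parameters).
import numpy as np

def simulate_speckle(echogenicity, f0=5e6, fs=40e6, pulse_cycles=1.5):
    h, w = echogenicity.shape
    scatterers = echogenicity * np.random.randn(h, w)  # diffuse scatterers
    t = (np.arange(h) - h // 2) / fs
    sigma = pulse_cycles / f0                          # pulse envelope width
    axial = np.cos(2 * np.pi * f0 * t) * np.exp(-t**2 / (2 * sigma**2))
    lateral = np.exp(-0.5 * ((np.arange(w) - w // 2) / 3.0)**2)
    psf = np.outer(axial, lateral)
    # Circular convolution in the Fourier domain; envelope detection and
    # log compression would follow to form a B-mode image.
    return np.real(np.fft.ifft2(np.fft.fft2(scatterers) * np.fft.fft2(psf)))
\end{verbatim}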
Quantitative ultrasound (QUS) parameters, such as the effective scatterer diameter (ESD), reveal tissue properties by analyzing the ultrasound backscattered echo signal. The ESD can be obtained by parametrizing the backscatter coefficient using form factor models. However, reporting a single scatterer size cannot accurately characterize a tissue, particularly when the medium contains scattering sources with a broad range of sizes. Here, we estimate the probability of contribution of each scatterer size by modeling the measured form factor as a linear combination of form factors from individual scatterer sizes. We perform the estimation using two novel techniques. In the first technique, we cast scatterer size distribution estimation as an optimization problem and efficiently solve it using a linear system of equations. In the second technique, we use the solution of this system of equations to constrain the optimization function and solve the constrained problem. The methods are evaluated on backscatter coefficients simulated using Faran theory. We evaluate the robustness of the proposed techniques by adding Gaussian noise. The results show that both methods can accurately estimate the scatterer size distribution, and that the second method outperforms the first.
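The first technique can be sketched as a constrained linear fit; the version below uses non-negative least squares to keep the estimated contributions valid as probabilities (an illustrative choice; matrix and variable names are assumptions):

\begin{verbatim}
# Sketch: estimate the scatterer-size distribution as a non-negative linear
# combination of single-size form factors.
import numpy as np
from scipy.optimize import nnls

def estimate_size_distribution(F, ff_measured):
    # F: (n_freqs, n_sizes) matrix whose columns are form factors of
    # individual scatterer sizes (e.g., from Faran theory) on a common
    # frequency grid; ff_measured: measured form factor on the same grid.
    p, _ = nnls(F, ff_measured)   # min ||F p - ff||_2 subject to p >= 0
    return p / p.sum()            # normalize to a probability distribution
\end{verbatim}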
On-line segmentation of the uterus can aid effective image-based guidance for precise delivery of dose to the target tissue (the uterocervix) during cervix cancer radiotherapy. 3D ultrasound (US) can be used to image the uterus; however, finding the position of the uterine boundary in US images is a challenging task due to large daily positional and shape changes in the uterus, large variation in bladder filling, and the limitations of 3D US images, such as low resolution in the elevational direction and imaging aberrations. Previous studies on uterus segmentation mainly focused on developing semi-automatic algorithms that require manual initialization by an expert clinician. Given the limited number of studies on automatic 3D uterus segmentation, the aim of the current study was to overcome the need for manual initialization in semi-automatic algorithms using recent deep learning-based algorithms. Therefore, we developed 2D UNet-based networks that were trained under two scenarios. In the first scenario, we trained three different networks, one for each plane (i.e., sagittal, coronal, axial). In the second scenario, our proposed network was trained using all the planes of each 3D volume. Our proposed scheme overcomes the need for the initial manual selection required by previous semi-automatic algorithms.
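Preparing the training data for the second scenario amounts to extracting 2D slices along all three planes of each 3D volume, roughly as in this sketch (the axis ordering of the volume is an assumption):

\begin{verbatim}
# Sketch: collect 2-D slices from all three orthogonal planes of a 3-D
# US volume to train a single 2D UNet.
import numpy as np

def extract_all_plane_slices(volume):
    # volume: 3-D array; axes assumed to index the three anatomical planes.
    slices = []
    for axis in range(3):
        for i in range(volume.shape[axis]):
            slices.append(np.take(volume, i, axis=axis))
    return slices
\end{verbatim}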