Abstract:Speckle tracking echocardiography (STE) is the clinical standard for myocardial strain estimation. Despite good performance on global longitudinal strain (GLS), its accuracy for regional strain remains limited, even though this biomarker is highly relevant for early diagnosis and the characterization of subtle abnormalities. Deep learning is a promising alternative, but its development is constrained by the lack of reliable motion references. Existing solutions rely either on STE-derived labels or on simulations generated by physics-based models, but these synthetic sequences still have limited realism compared with clinical data. In this paper, we propose a novel simulation strategy that incorporates speckle decorrelation measures from real videos and uses an iterative refinement process to improve the motion realism in the simulations. We created an open-source photorealistic dataset of 1,478 videos with reference motion, which was used to train an echocardiographic motion estimation algorithm. The proposed method achieves unmatched performance on global and regional strain, notably reaching a GLS variability of 1.42% in an inter-expert setting compared to 1.78% for the clinical reference.
Abstract:Myocardial point tracking (MPT) has recently emerged as a promising direction for motion estimation in echocardiography, driven by advances in general-purpose point tracking methods. However, myocardial motion fundamentally differs from motion encountered in natural videos, as it arises from physiologically constrained deformation that is spatially and temporally continuous throughout the cardiac cycle. Consequently, motion trajectories typically remain locally confined despite substantial tissue deformation. Motivated by these properties, we revisit the architectural design for MPT and find that coarse initialization in commonly used two-stage coarse-to-fine architectures may be unnecessary in this domain. In this work, we propose a fine-stage-only architecture, EchoTracker2, which enriches pixel-precise features with local spatiotemporal context and integrates them with long-range joint temporal reasoning for robust tracking. Experimental results across in-distribution, out-of-distribution (OOD), and public synthetic datasets show that our model improves position accuracy by $6.5\%$ and reduces median trajectory error by $12.2\%$ relative to a domain-specific state-of-the-art (SOTA) model. Compared to the best general-purpose point tracking method, the improvements are $2.0\%$ and $5.3\%$, respectively. Moreover, EchoTracker2 shows better agreement with expert-derived global longitudinal strain (GLS) and enhances test-retest reproducibility. Source code will be available at: https://github.com/riponazad/ptecho.
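As a rough illustration of how tracked myocardial points can be turned into a GLS value, the sketch below assumes Lagrangian strain of the contour length relative to the first frame, with that frame taken as end-diastole. The abstract does not specify the strain computation, so the function names and this definition are assumptions made for illustration only.

```python
from math import hypot


def contour_length(points):
    """Length of the polyline through consecutive (x, y) points."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def longitudinal_strain_curve(contours):
    """Strain in percent per frame, relative to the first frame (assumed end-diastole)."""
    l0 = contour_length(contours[0])
    if l0 == 0:
        raise ValueError("reference contour has zero length")
    return [100.0 * (contour_length(c) - l0) / l0 for c in contours]


def global_longitudinal_strain(contours):
    """Peak (most negative) strain over the sequence, in percent."""
    return min(longitudinal_strain_curve(contours))
```

Here `contours` is a list with one entry per frame, each entry being the tracked points ordered along the myocardial wall. Shortening of the wall gives negative strain, so the peak value is taken as the minimum of the curve.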
Abstract:Deep learning methods for point tracking are applicable in 2D echocardiography, but do not yet take advantage of domain specifics that enable extremely fast and efficient configurations. We developed MyoTracker, a low-complexity architecture (0.3M parameters) for point tracking in echocardiography. It builds on the CoTracker2 architecture by simplifying its components and extending the temporal context to provide point predictions for the entire sequence in a single step. We applied MyoTracker to the right ventricular (RV) myocardium in RV-focused recordings and compared the results with those of CoTracker2 and EchoTracker, another specialized point tracking architecture for echocardiography. MyoTracker achieved the lowest average point trajectory error at 2.00 $\pm$ 0.53 mm. Calculating RV Free Wall Strain (RV FWS) using MyoTracker's point predictions resulted in a -0.3$\%$ bias with 95$\%$ limits of agreement from -6.1$\%$ to 5.4$\%$ compared to reference values from commercial software. This range falls within the interobserver variability reported in previous studies. The limits of agreement were wider for both CoTracker2 and EchoTracker, worse than the interobserver variability. At inference, MyoTracker used 67$\%$ less GPU memory than CoTracker2 and 84$\%$ less than EchoTracker on large sequences (100 frames). MyoTracker was 74 times faster during inference than CoTracker2 and 11 times faster than EchoTracker with our setup. Maintaining the entire sequence in the temporal context was the greatest contributor to MyoTracker's accuracy. Slight additional gains can be made by re-enabling iterative refinement, at the cost of longer processing time.
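The bias and 95% limits of agreement quoted above are the usual Bland-Altman summary of paired differences. A minimal sketch follows, assuming the limits are computed as the bias plus or minus 1.96 sample standard deviations of the differences; the abstract does not state the exact formula, and the function name is illustrative.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # stdev is the sample standard deviation
    return bias, bias - spread, bias + spread
```

Applied to strain values, `measured` would hold the strain derived from the tracker and `reference` the strain from the reference software, both in percent.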
Abstract:One of the main challenges in current research on segmentation in cardiac ultrasound is the lack of large and varied labeled datasets and the differences in annotation conventions between datasets. This makes it difficult to design robust segmentation models that generalize well to external datasets. This work utilizes diffusion models to create generative augmentations that can significantly improve the diversity of the dataset and thus the generalizability of segmentation models without the need for more annotated data. The augmentations are applied in addition to regular augmentations. A visual test survey showed that experts cannot clearly distinguish between real and fully generated images. Using the proposed generative augmentations, segmentation robustness was increased when training on an internal dataset and testing on an external dataset, with an improvement of over 20 millimeters in Hausdorff distance. Additionally, the limits of agreement for automatic ejection fraction estimation improved by up to 20% of absolute ejection fraction value on out-of-distribution cases. These improvements come exclusively from the increased variation of the training data using the generative augmentations, without modifying the underlying machine learning model. The augmentation tool is available as an open source Python library at https://github.com/GillesVanDeVyver/EchoGAINS.
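For readers unfamiliar with the Hausdorff distance used above, the following is a minimal sketch of the symmetric Hausdorff distance between two contours given as point lists. It is a brute-force illustration, not the evaluation code of the paper; the `pixel_spacing_mm` parameter is a hypothetical helper for converting pixel coordinates to millimeters.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def hausdorff(a, b, pixel_spacing_mm=1.0):
    """Symmetric Hausdorff distance, scaled by an assumed isotropic pixel spacing."""
    return pixel_spacing_mm * max(directed_hausdorff(a, b),
                                  directed_hausdorff(b, a))
```

The measure is driven by the single worst-matching point, which is why it is sensitive to the isolated segmentation failures that appear on external datasets.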
Abstract:Tissue tracking in echocardiography is challenging due to the complex cardiac motion and the inherent nature of ultrasound acquisitions. Although optical flow methods are considered state-of-the-art (SOTA), they struggle with long-range tracking, noise occlusions, and drift throughout the cardiac cycle. Recently, novel learning-based point tracking techniques have been introduced to tackle some of these issues. In this paper, we build upon these techniques and introduce EchoTracker, a two-fold coarse-to-fine model that facilitates the tracking of queried points on a tissue surface across ultrasound image sequences. The architecture contains a preliminary coarse initialization of the trajectories, followed by reinforcement iterations based on fine-grained appearance changes. It is efficient, light, and can run on mid-range GPUs. Experiments demonstrate that the model outperforms SOTA methods, with an average position accuracy of 67% and a median trajectory error of 2.86 pixels. Furthermore, we show a relative improvement of 25% when using our model to calculate the global longitudinal strain (GLS) in a clinical test-retest dataset compared to other methods. This implies that learning-based point tracking can potentially improve performance and yield a higher diagnostic and prognostic value for clinical measurements than current techniques. Our source code is available at: https://github.com/riponazad/echotracker/.
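The two tracking metrics reported above can be sketched as follows. The median trajectory error is taken as the median Euclidean distance between predicted and reference positions over all points and frames. For position accuracy, the sketch assumes the convention common in point tracking benchmarks of averaging the fraction of points within 1, 2, 4, 8 and 16 pixels; the abstract does not list the thresholds, so they are an assumption.

```python
from math import hypot
from statistics import median


def point_errors(pred, truth):
    """Euclidean error for every (point, frame) pair.

    pred and truth are lists of trajectories; a trajectory is a list of (x, y).
    """
    return [hypot(px - tx, py - ty)
            for p_traj, t_traj in zip(pred, truth)
            for (px, py), (tx, ty) in zip(p_traj, t_traj)]


def median_trajectory_error(pred, truth):
    """Median error in pixels over all points and frames."""
    return median(point_errors(pred, truth))


def position_accuracy(pred, truth, thresholds=(1, 2, 4, 8, 16)):
    """Fraction of positions within each threshold, averaged over thresholds."""
    errors = point_errors(pred, truth)
    fractions = [sum(e < t for e in errors) / len(errors) for t in thresholds]
    return sum(fractions) / len(fractions)
```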
Abstract:Cardiac valve event timing plays a crucial role when conducting clinical measurements using echocardiography. However, established automated approaches are limited by the need for external electrocardiogram sensors, and manual measurements often rely on timing from different cardiac cycles. Recent methods have applied deep learning to cardiac timing, but they have mainly been restricted to detecting only two key time points, namely end-diastole (ED) and end-systole (ES). In this work, we propose a deep learning approach that leverages triplane recordings to enhance detection of valve events in echocardiography. Our method demonstrates improved performance in detecting six different events, including valve events conventionally associated with ED and ES. Across all events, the average absolute frame difference (aFD) ranges from at most 1.4 frames (29 ms) for start of diastasis down to 0.6 frames (12 ms) for mitral valve opening, when performing a ten-fold cross-validation with test splits on triplane data from 240 patients. On an external independent test set consisting of apical long-axis data from 180 other patients, the worst performing event detection had an aFD of 1.8 frames (30 ms). The proposed approach has the potential to significantly impact clinical practice by enabling more accurate, rapid and comprehensive event detection, leading to improved clinical measurements.
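The aFD metric and its conversion to milliseconds can be sketched as below, assuming aFD is the mean absolute difference between predicted and reference frame indices and that the conversion uses the frame rate of the recording. The function names are illustrative, and the frame rates behind the figures above are not given in the abstract.

```python
def average_abs_frame_difference(predicted, reference):
    """Mean absolute difference between predicted and reference frame indices."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def frames_to_ms(frames, fps):
    """Convert a frame difference to milliseconds for a given frame rate."""
    return frames * 1000.0 / fps
```

Because the time per frame depends on the frame rate, the same aFD in frames corresponds to different times in milliseconds on datasets acquired at different frame rates.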