Abstract:Patient status, angiographic and procedural characteristics encode crucial signals for predicting long-term outcomes after percutaneous coronary intervention (PCI). The aim of the study was to develop a predictive model for assessing the risk of cardiac death based on the real and synthetic data of patients undergoing PCI and to identify the factors that have the greatest impact on mortality. We analyzed 2,044 patients, who underwent a PCI for bifurcation lesions. The primary outcome was cardiac death at 3-year follow-up. Several machine learning models were applied to predict three-year mortality after PCI. To address class imbalance and improve the representation of the minority class, an additional 500 synthetic samples were generated and added to the training set. To evaluate the contribution of individual features to model performance, we applied permutation feature importance. An additional experiment was conducted to evaluate how the model's predictions would change after removing non-informative features from the training and test datasets. Without oversampling, all models achieve high overall accuracy (0.92-0.93), yet they almost completely ignore the minority class. Across models, augmentation consistently increases minority-class recall with minimal loss of AUROC, improves probability quality, and yields more clinically reasonable risk estimates on the constructed severe profiles. According to feature importance analysis, four features emerged as the most influential: Age, Ejection Fraction, Peripheral Artery Disease, and Cerebrovascular Disease. These results show that straightforward augmentation with realistic and extreme cases can expose, quantify, and reduce brittleness in imbalanced clinical prediction using only tabular records, and motivate routine reporting of probability quality and stress tests alongside headline metrics.




Abstract:The SYNTAX score has become a widely used measure of coronary disease severity , crucial in selecting the optimal mode of revascularization. This paper introduces a new medical regression and classification problem - automatically estimating SYNTAX score from coronary angiography. Our study presents a comprehensive dataset of 1,844 patients, featuring a balanced distribution of individuals with zero and non-zero scores. This dataset includes a first-of-its-kind, complete coronary angiography samples captured through a multi-view X-ray video, allowing one to observe coronary arteries from multiple perspectives. Furthermore, we present a novel, fully automatic end-to-end method for estimating the SYNTAX. For such a difficult task, we have achieved a solid coefficient of determination R2 of 0.51 in score predictions.




Abstract:Background. Cardiac dominance classification is essential for SYNTAX score estimation, which is a tool used to determine the complexity of coronary artery disease and guide patient selection toward optimal revascularization strategy. Objectives. Cardiac dominance classification algorithm based on the analysis of right coronary artery (RCA) angiograms using neural network Method. We employed convolutional neural network ConvNext and Swin transformer for 2D image (frames) classification, along with a majority vote for cardio angiographic view classification. An auxiliary network was also used to detect irrelevant images which were then excluded from the data set. Our data set consisted of 828 angiographic studies, 192 of them being patients with left dominance. Results. 5-fold cross validation gave the following dominance classification metrics (p=95%): macro recall=93.1%, accuracy=93.5%, macro F1=89.2%. The most common case in which the model regularly failed was RCA occlusion, as it requires utilization of LCA information. Another cause for false prediction is a small diameter combined with poor quality cardio angiographic view. In such cases, cardiac dominance classification can be complex and may require discussion among specialists to reach an accurate conclusion. Conclusion. The use of machine learning approaches to classify cardiac dominance based on RCA alone has been shown to be successful with satisfactory accuracy. However, for higher accuracy, it is necessary to utilize LCA information in the case of an occluded RCA and detect cases where there is high uncertainty.