Anli Ji

Physics-Guided Counterfactual Explanations for Large-Scale Multivariate Time Series: Application in Scalable and Interpretable SEP Event Prediction

Jan 13, 2026

Pranjal Patil, Anli Ji, Berkay Aydin

Abstract: Accurate prediction of solar energetic particle events is vital for safeguarding satellites, astronauts, and space-based infrastructure. Modern space weather monitoring generates massive volumes of high-frequency, multivariate time series (MVTS) data from sources such as the Geostationary Operational Environmental Satellites (GOES). Machine learning (ML) models trained on this data show strong predictive power, but most existing methods overlook domain-specific feasibility constraints. Counterfactual explanations have emerged as a key tool for improving model interpretability, yet existing approaches rarely enforce physical plausibility. This work introduces a Physics-Guided Counterfactual Explanation framework, a novel method for generating counterfactual explanations in time series classification tasks that remain consistent with underlying physical principles. Applied to solar energetic particle (SEP) forecasting, this framework achieves an over 80% reduction in Dynamic Time Warping (DTW) distance (that is, higher proximity), produces counterfactual explanations with higher sparsity, and reduces runtime by nearly 50% compared to state-of-the-art baselines such as DiCE. Beyond numerical improvements, this framework ensures that generated counterfactual explanations are physically plausible and actionable in scientific domains. In summary, the framework generates counterfactual explanations that are both valid and physically consistent, while laying the foundation for scalable counterfactual generation in big data environments.

* This is a pre-print of an accepted paper at IEEE BigData 2025, SS 11: Towards an Understanding of Artificial Intelligence: Bridging Theory, Explainability, and Practical Applications
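
The proximity measure named in the abstract above, Dynamic Time Warping (DTW) distance, can be illustrated with a minimal sketch. This is the textbook dynamic-programming form for two univariate series with an absolute-difference cost; the function name `dtw_distance` and the sample values are hypothetical, and the paper's multivariate setup and any windowing constraints are not reproduced here.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with an absolute-difference local cost.

    Illustrative only: a univariate simplification, not the paper's pipeline.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical original series and a candidate counterfactual.
original = [1.0, 2.0, 3.0]
counterfactual = [2.0, 3.0, 4.0]
print(dtw_distance(original, counterfactual))  # 2.0
```

A smaller DTW distance between an instance and its counterfactual means the counterfactual stays closer to the original series, which is the sense in which a reduction in DTW distance corresponds to higher proximity.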


Enhancing Explainability in Solar Energetic Particle Event Prediction: A Global Feature Mapping Approach

Nov 12, 2025

Anli Ji, Pranjal Patil, Chetraj Pandey, Manolis K. Georgoulis, Berkay Aydin

Abstract: Solar energetic particle (SEP) events, as one of the most prominent manifestations of solar activity, can generate severe hazardous radiation when accelerated by solar flares or shock waves formed aside from coronal mass ejections (CMEs). However, most existing data-driven methods used for SEP predictions operate as black-box models, making it challenging for solar physicists to interpret the results and understand the underlying physical causes of such events rather than just obtain a prediction. To address this challenge, we propose a novel framework that integrates global explanations and ad-hoc feature mapping to enhance model transparency and provide deeper insights into the decision-making process. We validate our approach using a dataset of 341 SEP events, including 244 significant (>=10 MeV) proton events exceeding the Space Weather Prediction Center S1 threshold, spanning solar cycles 22, 23, and 24. Furthermore, we present an explainability-focused case study of major SEP events, demonstrating how our method improves explainability and facilitates a more physics-informed understanding of SEP event prediction.

* 10 pages, 3 figures. This is a pre-print of an accepted paper at ICDMW: SABID 2025


Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF

Sep 06, 2024

Anli Ji, Chetraj Pandey, Berkay Aydin

Abstract: Traditional solar flare forecasting approaches have mostly relied on physics-based or data-driven models using solar magnetograms, treating flare predictions as a point-in-time classification problem. This approach has limitations, particularly in capturing the evolving nature of solar activity. Recognizing the limitations of traditional flare forecasting approaches, our research aims to uncover hidden relationships and the evolutionary characteristics of solar flares and their source regions. Our previously proposed Sliding Window Multivariate Time Series Forest (Slim-TSF) has been shown to be feasible on multivariate time series data. A significant aspect of this study is the comparative analysis of our updated Slim-TSF framework against the original model outcomes. Preliminary findings indicate a notable improvement, with an average increase of 5% in both the True Skill Statistic (TSS) and Heidke Skill Score (HSS). This enhancement not only underscores the effectiveness of our refined methodology but also suggests that our systematic evaluation and feature selection approach can significantly advance the predictive accuracy of solar flare forecasting models.

* This is a preprint accepted at the 26th International Conference on Big Data Analytics and Knowledge Discovery (DAWAK 2024)
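
The two skill scores reported in the abstract above can both be computed from a binary confusion matrix. The sketch below uses the standard definition of TSS (recall minus false alarm rate) and the chance-corrected form of HSS that is common in flare forecasting; the abstract does not state which HSS variant the authors use, so that choice is an assumption, as are the function name `skill_scores` and the sample counts.

```python
def skill_scores(tp, fn, fp, tn):
    """Return (TSS, HSS) for a binary confusion matrix.

    TSS = TP / (TP + FN) - FP / (FP + TN)
    HSS = 2 (TP * TN - FP * FN) / ((TP + FN)(FN + TN) + (TP + FP)(FP + TN))
    The HSS variant is an assumption; the source abstract does not specify it.
    """
    tss = tp / (tp + fn) - fp / (fp + tn)
    hss_denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    hss = 2 * (tp * tn - fp * fn) / hss_denominator
    return tss, hss


# Hypothetical counts: 100 flaring and 100 non-flaring instances.
tss, hss = skill_scores(tp=50, fn=50, fp=10, tn=90)
print(round(tss, 2), round(hss, 2))  # 0.4 0.4
```

TSS is insensitive to the class-imbalance ratio, while HSS measures improvement over a random forecast, which is why flare forecasting studies often report both.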


Embedding Ordinality to Binary Loss Function for Improving Solar Flare Forecasting

Aug 21, 2024

Chetraj Pandey, Anli Ji, Jinsu Hong, Rafal A. Angryk, Berkay Aydin

Abstract: In this paper, we propose a novel loss function aimed at optimizing the binary flare prediction problem by embedding the intrinsic ordinal flare characteristics into the binary cross-entropy (BCE) loss function. This modification is intended to provide the model with better guidance based on the ordinal characteristics of the data and improve the overall performance of the models. For our experiments, we employ a ResNet34-based model with transfer learning to predict $\geq$M-class flares by utilizing the shape-based features of magnetograms of active region (AR) patches spanning from $-$90$^{\circ}$ to $+$90$^{\circ}$ of solar longitude as our input data. We use a composite skill score (CSS) as our evaluation metric, which is calculated as the geometric mean of the True Skill Statistic (TSS) and the Heidke Skill Score (HSS), to rank and compare our models' performance. The primary contributions of this work are as follows: (i) we introduce a novel approach to encode ordinality into a binary loss function, showing an application to solar flare prediction; (ii) we enhance solar flare forecasting by enabling flare predictions for each AR across the entire solar disk, without any longitudinal restrictions, and evaluate and compare performance; (iii) our candidate model, optimized with the proposed loss function, shows an improvement of $\sim$7%, $\sim$4%, and $\sim$3% for AR patches within $\pm$30$^\circ$, $\pm$60$^\circ$, and $\pm$90$^\circ$ of solar longitude, respectively, in terms of CSS, when compared with standard BCE. Additionally, we demonstrate the ability to issue flare forecasts for ARs in near-limb regions (regions between $\pm$60$^{\circ}$ and $\pm$90$^{\circ}$) with a CSS=0.34 (TSS=0.50 and HSS=0.23), expanding the scope of AR-based models for solar flare prediction. This advances the reliability of solar flare forecasts, leading to more effective prediction capabilities.

* 10 pages, 8 figures. This manuscript is accepted for publication at the DSAA 2024 conference. arXiv admin note: substantial text overlap with arXiv:2406.11054
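
The composite skill score described in the abstract above is the geometric mean of TSS and HSS, which can be checked against the near-limb figures it reports (TSS=0.50, HSS=0.23, CSS=0.34). The function name `composite_skill_score` is hypothetical, and rejecting negative inputs is an assumption made here because the geometric mean is not defined for them.

```python
import math


def composite_skill_score(tss, hss):
    """Geometric mean of TSS and HSS, as described in the abstract.

    Raising on negative inputs is an assumption of this sketch; the abstract
    does not say how negative skill scores are handled.
    """
    if tss < 0 or hss < 0:
        raise ValueError("geometric mean is undefined for negative skill scores")
    return math.sqrt(tss * hss)


# Near-limb values quoted in the abstract.
print(round(composite_skill_score(0.50, 0.23), 2))  # 0.34
```

Because the geometric mean is pulled toward the smaller of its two inputs, a model cannot reach a high CSS by excelling on only one of the two scores.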


Active Region-based Flare Forecasting with Sliding Window Multivariate Time Series Forest Classifiers

Feb 05, 2024

Anli Ji, Berkay Aydin

Abstract: Over the past few decades, many applications of physics-based simulations and data-driven techniques (including machine learning and deep learning) have emerged to analyze and predict solar flares. These approaches are pivotal in understanding the dynamics of solar flares, primarily aiming to forecast these events and minimize potential risks they may pose to Earth. Although current methods have made significant progress, there are still limitations to these data-driven approaches. One prominent drawback is the lack of consideration for the temporal evolution characteristics in the active regions from which these flares originate. This oversight hinders the ability of these methods to grasp the relationships between high-dimensional active region features, thereby limiting their usability in operations. This study centers on the development of interpretable classifiers for multivariate time series and the demonstration of a novel feature ranking method with sliding window-based sub-interval ranking. The primary contribution of our work is to bridge the gap between complex, less understandable black-box models used for high-dimensional data and the exploration of relevant sub-intervals from multivariate time series, specifically in the context of solar flare forecasting. Our findings demonstrate that our sliding-window time series forest classifier performs effectively in solar flare prediction (with a True Skill Statistic of over 85%) while also pinpointing the most crucial features and sub-intervals for a given learning task.
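
The sliding-window sub-interval idea in the abstract above can be sketched as follows: slide a fixed-length window over one parameter's time series and summarize each sub-interval with a few statistics. Mean, standard deviation, and slope are the statistics usually associated with Time Series Forest classifiers; whether the authors use exactly these, and the names `sliding_windows`, `slope`, and `interval_features`, are assumptions of this sketch rather than details from the paper.

```python
from statistics import mean, pstdev


def sliding_windows(length, window, step):
    """Return (start, end) index pairs of fixed-size windows over a series."""
    return [(s, s + window) for s in range(0, length - window + 1, step)]


def slope(values):
    """Least-squares slope of values against their positions 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den if den else 0.0


def interval_features(series, window, step):
    """Summarize each sub-interval with mean, standard deviation, and slope."""
    features = []
    for start, end in sliding_windows(len(series), window, step):
        segment = series[start:end]
        features.append({
            "interval": (start, end),
            "mean": mean(segment),
            "std": pstdev(segment),
            "slope": slope(segment),
        })
    return features


# Hypothetical series for a single active-region parameter.
series = [float(i) for i in range(8)]
for row in interval_features(series, window=4, step=2):
    print(row)
```

Because every feature is tied to a named statistic over a known sub-interval, a feature importance ranking from a tree ensemble trained on such features points directly at the time spans that mattered for a prediction.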


Towards Interpretable Solar Flare Prediction with Attention-based Deep Neural Networks

Sep 08, 2023

Chetraj Pandey, Anli Ji, Rafal A. Angryk, Berkay Aydin

Abstract: Solar flare prediction is a central problem in space weather forecasting, and recent developments in machine learning and deep learning have accelerated the adoption of complex models for data-driven solar flare forecasting. In this work, we developed an attention-based deep learning model as an improvement over the standard convolutional neural network (CNN) pipeline to perform full-disk binary flare predictions for the occurrence of $\geq$M1.0-class flares within the next 24 hours. For this task, we collected compressed images created from full-disk line-of-sight (LoS) magnetograms. We used data-augmented oversampling to address the class imbalance issue and used true skill statistic (TSS) and Heidke skill score (HSS) as the evaluation metrics. Furthermore, we interpreted our model by overlaying attention maps on input magnetograms and visualized the important regions focused on by the model that led to the eventual decision. The significant findings of this study are: (i) we successfully implemented an attention-based full-disk flare predictor ready for operational forecasting, where the candidate model achieves an average TSS=0.54$\pm$0.03 and HSS=0.37$\pm$0.07, (ii) we demonstrated that our full-disk model can learn conspicuous features corresponding to active regions from full-disk magnetogram images, and (iii) our experimental evaluation suggests that our model can predict near-limb flares with adept skill and the predictions are based on relevant active regions (ARs) or AR characteristics from full-disk magnetograms.

* This is a preprint accepted at the 6th International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2023. 8 pages, 6 figures


Exploring Deep Learning for Full-disk Solar Flare Prediction with Empirical Insights from Guided Grad-CAM Explanations

Aug 30, 2023

Chetraj Pandey, Anli Ji, Trisha Nandakumar, Rafal A. Angryk, Berkay Aydin

Abstract: This study advances solar flare prediction research by presenting a full-disk deep-learning model to forecast $\geq$M-class solar flares and evaluating its efficacy on both central (within $\pm$70$^\circ$) and near-limb (beyond $\pm$70$^\circ$) events, showcasing qualitative assessment of post hoc explanations for the model's predictions, and providing empirical findings from human-centered quantitative assessments of these explanations. Our model is trained using hourly full-disk line-of-sight magnetogram images to predict $\geq$M-class solar flares within the subsequent 24-hour prediction window. Additionally, we apply the Guided Gradient-weighted Class Activation Mapping (Guided Grad-CAM) attribution method to interpret our model's predictions and evaluate the explanations. Our analysis reveals that full-disk solar flare predictions correspond with active region characteristics. The following points represent the most important findings of our study: (1) our deep learning models achieved an average true skill statistic (TSS) of $\sim$0.51 and a Heidke skill score (HSS) of $\sim$0.38, exhibiting skill in predicting solar flares, where for central locations the average recall is $\sim$0.75 (recall values for X- and M-class are 0.95 and 0.73, respectively) and for near-limb flares the average recall is $\sim$0.52 (recall values for X- and M-class are 0.74 and 0.50, respectively); (2) qualitative examination of the model's explanations reveals that it discerns and leverages features linked to active regions in both central and near-limb locations within full-disk magnetograms to produce respective predictions. In essence, our models grasp the shape and texture-based properties of flaring active regions, even in proximity to limb areas, a novel and essential capability with considerable significance for operational forecasting systems.

* This is a preprint accepted at the 10th IEEE International Conference On Data Science And Advanced Analytics (DSAA 2023). The conference proceedings will be published by the IEEE Xplore Digital Library with ISBN: 979-8-3503-4503-2. 10 pages, 6 figures


All-Clear Flare Prediction Using Interval-based Time Series Classifiers

May 03, 2021

Anli Ji, Berkay Aydin, Manolis K. Georgoulis, Rafal Angryk

Abstract: An all-clear flare prediction is a type of solar flare forecasting that puts more emphasis on predicting non-flaring instances (often relatively small flares and flare-quiet regions) with high precision while still maintaining valuable predictive results. While many flare prediction studies do not address this problem directly, all-clear predictions can be useful in an operational context. However, in all-clear predictions, finding the right balance between avoiding false negatives (misses) and reducing false positives (false alarms) is often challenging. Our study focuses on training and testing a set of interval-based time series classifiers named Time Series Forest (TSF). These classifiers will be used towards building an all-clear flare prediction system by utilizing multivariate time series data. Throughout this paper, we demonstrate our data collection, predictive model building and evaluation processes, and compare our time series classification models with baselines using our benchmark datasets. Our results show that time series classifiers provide better forecasting results in terms of skill scores, precision and recall metrics, and they can be further improved for more precise all-clear forecasts by tuning model hyperparameters.
