Clinical electroencephalography is routinely used to evaluate patients with diverse and often overlapping neurological conditions, yet interpretation remains manual, time-intensive, and variable across experts. While automated EEG analysis has been widely studied, most existing methods target isolated diagnostic problems, particularly seizure detection, and provide limited support for multi-disorder clinical screening. This study examines automated EEG-based classification across eleven clinically relevant neurological disorder categories, encompassing acute time-critical conditions, chronic neurocognitive and developmental disorders, and disorders with indirect or weak electrophysiological signatures. EEG recordings are processed using a standard longitudinal bipolar montage and represented through a multi-domain feature set capturing temporal statistics, spectral structure, signal complexity, and inter-channel relationships. Disorder-aware machine learning models are trained under severe class imbalance, with decision thresholds explicitly calibrated to prioritize diagnostic sensitivity. Evaluation on a large, heterogeneous clinical EEG dataset demonstrates that sensitivity-oriented modeling achieves recall exceeding 80% for the majority of disorder categories, with several low-prevalence conditions showing absolute recall gains of 15-30% after threshold calibration compared to default operating points. Feature importance analysis reveals physiologically plausible patterns consistent with established clinical EEG markers. These results establish realistic performance baselines for multi-disorder EEG classification and provide quantitative evidence that sensitivity-prioritized automated analysis can support scalable EEG screening and triage in real-world clinical settings.
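As a concrete illustration of the threshold-calibration step described above, the following minimal sketch selects, per disorder class in a one-vs-rest fashion, the highest decision threshold whose validation recall still meets a sensitivity target. The `target_recall` value and array names are illustrative assumptions, not the study's exact procedure.

```python
# Hedged sketch: per-class threshold calibration for a sensitivity-first
# operating point, applied one-vs-rest to each disorder category.
import numpy as np
from sklearn.metrics import precision_recall_curve

def calibrate_threshold(y_true, y_score, target_recall=0.80):
    """Return the highest threshold whose validation recall still meets
    the target, i.e. maximize precision subject to the sensitivity floor."""
    _, recall, thresholds = precision_recall_curve(y_true, y_score)
    # precision_recall_curve returns len(thresholds) + 1 recall points;
    # drop the final (recall = 0) point to align with `thresholds`.
    ok = recall[:-1] >= target_recall
    if not ok.any():
        return thresholds.min()   # fall back to the most sensitive point
    return thresholds[ok].max()   # highest threshold still meeting recall
```

Applied per class on held-out validation scores, this replaces the default 0.5 operating point with one tuned for diagnostic sensitivity.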
Objective: We develop a channel-adaptive (CA) architecture that seamlessly processes multivariate time series with an arbitrary number of channels, and in particular intracranial electroencephalography (iEEG) recordings. Methods: Our CA architecture first processes the iEEG signal by applying state-of-the-art models to each channel independently. The resulting features are then fused using a vector-symbolic algorithm that reconstructs the spatial relationship using a trainable scalar per channel. Finally, the fused features are accumulated in a long-term memory of up to 2 minutes to perform the classification. Each CA-model can then be pre-trained on a large corpus of iEEG recordings from multiple heterogeneous subjects. The pre-trained model is personalized to each subject via a quick fine-tuning routine, which uses equal or smaller amounts of data than existing state-of-the-art models while requiring only one fifth of the time. Results: We evaluate our CA-models on a seizure detection task on both a short-term (~20 hours) and a long-term (~2500 hours) dataset. In particular, our CA-EEGWaveNet is trained on a single seizure of the tested subject, while the baseline EEGWaveNet is trained on all but one. Even in this challenging scenario, our CA-EEGWaveNet surpasses the baseline in median F1-score (0.78 vs. 0.76). Similarly, CA-EEGNet, based on EEGNet, surpasses its baseline in median F1-score (0.79 vs. 0.74). Conclusion and significance: Our CA-model addresses two issues: first, it is channel-adaptive and can therefore be trained across heterogeneous subjects without loss of performance; second, it increases the effective temporal context to a clinically relevant length. Our model is therefore a drop-in replacement for existing models, offering better characteristics and performance across the board.
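The channel fusion step might look roughly like the following PyTorch sketch, in which one trainable scalar per channel weights that channel's features before superposition by summation, making the output independent of channel count; the paper's actual vector-symbolic binding is richer than this simplification.

```python
# Minimal sketch of channel-count-agnostic fusion with one trainable
# scalar per channel (a simplification of the vector-symbolic scheme).
import torch
import torch.nn as nn

class ScalarChannelFusion(nn.Module):
    def __init__(self, num_channels: int):
        super().__init__()
        # One trainable scalar per channel encodes its spatial role.
        self.alpha = nn.Parameter(torch.ones(num_channels))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, channels, feature_dim), one vector per channel.
        weighted = feats * self.alpha.view(1, -1, 1)
        return weighted.sum(dim=1)   # superposition over channels

fused = ScalarChannelFusion(num_channels=32)(torch.randn(4, 32, 128))
print(fused.shape)                   # torch.Size([4, 128])
```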
The manual labeling of events in electroencephalography (EEG) records is time-consuming, especially when EEG recordings are taken continuously over weeks to months. A method that automatically labels pertinent EEG events therefore reduces the manual workload. Spike-wave discharges (SWDs), the electrographic hallmark of absence seizures, are EEG events that are often labeled manually. While some previous studies have used machine learning to automatically segment and classify EEG signals such as SWDs, their performance leaves room for improvement. Here we compare the performance of 14 machine learning classifiers on our own manually annotated dataset of 961 hours of EEG recordings from C3H/HeJ mice, including 22,637 labeled SWDs. We find that a 1D UNet performs best for labeling SWDs in this dataset. We further improve the 1D UNet by augmenting our training data and determine that scaling shows the greatest benefit of all augmentation procedures applied. We then compare the 1D UNet with data augmentation, AugUNet1D, against a recently published time- and frequency-based algorithmic approach called "Twin Peaks". AugUNet1D shows superior performance and detects events whose features more closely match those of the manually labeled SWDs. AugUNet1D, either pretrained on our manually annotated data or untrained, is made publicly available for other users.
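A hedged sketch of the scaling augmentation reported as most beneficial: each training segment is multiplied by a random gain, leaving its labels unchanged. The gain range below is an illustrative assumption.

```python
# Illustrative amplitude-scaling augmentation for 1-D EEG segments.
import numpy as np

def scale_augment(segment: np.ndarray, rng, low=0.8, high=1.2):
    """Multiply a 1-D EEG segment by a random gain; labels are preserved."""
    return segment * rng.uniform(low, high)

rng = np.random.default_rng(0)
augmented = scale_augment(np.random.randn(5000), rng)
```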
Explainability, domain generalization, and rare-class reliability are critical challenges in medical AI, where deep models often fail under real-world distribution shifts and exhibit bias against infrequent clinical conditions. This paper introduces XAIMeD, an explainable medical AI framework that integrates clinically accurate expert knowledge into deep learning through a unified neuro-symbolic architecture. XAIMeD is designed to improve robustness under distribution shift, enhance rare-class sensitivity, and deliver transparent, clinically aligned interpretations. The framework encodes clinical expertise as logical connectives over atomic medical propositions, transforming them into machine-checkable, class-specific rules. Their diagnostic utility is quantified through weighted feature-satisfaction scores, enabling a symbolic reasoning branch that complements neural predictions. A confidence-weighted fusion integrates symbolic and deep outputs, while a Hunt-inspired adaptive routing mechanism guided by Entropy Imbalance Gain (EIG) and Rare-Class Gini mitigates class imbalance, high intra-class variability, and uncertainty. We evaluate XAIMeD across diverse modalities on four challenging tasks, including (i) Seizure Onset Zone (SOZ) localization from rs-fMRI and (ii) Diabetic Retinopathy grading across six multicenter datasets. The results demonstrate substantial performance improvements, including 6% gains in cross-domain generalization and a 10% improvement in rare-class F1 score, far outperforming state-of-the-art deep learning baselines. Ablation studies confirm that the clinically grounded symbolic components act as effective regularizers, ensuring robustness to distribution shifts. XAIMeD thus provides a principled, clinically faithful, and interpretable approach to multimodal medical AI.
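The confidence-weighted fusion of symbolic and neural outputs could be sketched as follows; the entropy-based confidence used here is an assumption standing in for XAIMeD's exact weighting rule.

```python
# Assumed fusion rule: trust the network less when its prediction is
# high-entropy, shifting weight toward the symbolic rule scores.
import numpy as np

def fuse(p_neural: np.ndarray, s_symbolic: np.ndarray) -> np.ndarray:
    """Blend class scores from the neural and symbolic branches."""
    eps = 1e-12
    entropy = -(p_neural * np.log(p_neural + eps)).sum()
    conf = 1.0 - entropy / np.log(len(p_neural))  # 1 = certain, 0 = uniform
    fused = conf * p_neural + (1.0 - conf) * s_symbolic
    return fused / fused.sum()

print(fuse(np.array([0.5, 0.3, 0.2]), np.array([0.1, 0.8, 0.1])))
```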
Accurate yet interpretable image-based diagnosis remains a central challenge in medical AI, particularly in settings characterized by limited data, subtle visual cues, and high-stakes clinical decision-making. Most existing vision models rely on purely data-driven learning and produce black-box predictions with limited interpretability and poor cross-domain generalization, hindering their real-world clinical adoption. We present NEURO-GUARD, a novel knowledge-guided vision framework that integrates Vision Transformers (ViTs) with language-driven reasoning to improve performance, transparency, and domain robustness. NEURO-GUARD employs a retrieval-augmented generation (RAG) mechanism for self-verification, in which a large language model (LLM) iteratively generates, evaluates, and refines feature-extraction code for medical images. By grounding this process in clinical guidelines and expert knowledge, the framework progressively enhances feature detection and classification beyond purely data-driven baselines. Extensive experiments on diabetic retinopathy classification across four benchmark datasets (APTOS, EyePACS, Messidor-1, and Messidor-2) demonstrate that NEURO-GUARD improves accuracy by 6.2% over a ViT-only baseline (84.69% vs. 78.4%) and achieves a 5% gain in domain generalization. Additional evaluations on MRI-based seizure detection further confirm its cross-domain robustness, consistently outperforming existing methods. Overall, NEURO-GUARD bridges symbolic medical reasoning with subsymbolic visual learning, enabling interpretable, knowledge-aware, and generalizable medical image diagnosis while achieving state-of-the-art performance across multiple datasets.
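The generate-evaluate-refine loop might be organized as in this schematic sketch, where `llm_generate`, `run_feature_code`, and `retrieve_guidelines` are hypothetical callables standing in for the LLM, the execution sandbox, and the RAG retriever; none of these names come from the paper.

```python
# Schematic self-verification loop over LLM-generated feature-extraction
# code, grounded in retrieved clinical guidelines (all callables assumed).
def self_verify(llm_generate, run_feature_code, retrieve_guidelines,
                task: str, val_images, val_labels, rounds: int = 3):
    context = retrieve_guidelines(task)      # ground in clinical knowledge
    best_code, best_score, feedback = None, -1.0, ""
    for _ in range(rounds):
        code = llm_generate(task, context, feedback)
        score, feedback = run_feature_code(code, val_images, val_labels)
        if score > best_score:               # keep the best verified code
            best_code, best_score = code, score
    return best_code, best_score
```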
Many deep learning approaches have been developed for EEG-based seizure detection; however, most rely on access to large centralized annotated datasets. In clinical practice, EEG data are often scarce, patient-specific, distributed across institutions, and governed by strict privacy regulations that prohibit data pooling. As a result, creating usable AI-based seizure detection models remains challenging in real-world medical settings. To address these constraints, we propose a two-stage federated few-shot learning (FFSL) framework for personalized EEG-based seizure detection. The method is trained and evaluated on the TUH Event Corpus, which includes six EEG event classes. In Stage 1, a pretrained biosignal transformer (BIOT) is fine-tuned across non-IID simulated hospital sites using federated learning, enabling shared representation learning without centralizing EEG recordings. In Stage 2, federated few-shot personalization adapts the classifier to each patient using only five labeled EEG segments, retaining seizure-specific information while still benefiting from cross-site knowledge. Federated fine-tuning achieved a balanced accuracy of 0.43 (centralized: 0.52), Cohen's kappa of 0.42 (0.49), and weighted F1 of 0.69 (0.74). In the FFSL stage, client-specific models reached an average balanced accuracy of 0.77, Cohen's kappa of 0.62, and weighted F1 of 0.73 across four sites with heterogeneous event distributions. These results suggest that FFSL can support effective patient-adaptive seizure detection under realistic data-availability and privacy constraints.
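Stage 1's federated fine-tuning presumably aggregates client updates on a server; the abstract does not name the aggregation rule, so the FedAvg-style weighted averaging below is an assumption.

```python
# Assumed FedAvg-style aggregation of client model weights, weighted by
# each simulated hospital site's local dataset size.
import torch

def fedavg(client_states, client_sizes):
    """Average client state_dicts, weighted by local dataset size."""
    total = float(sum(client_sizes))
    avg = {k: torch.zeros_like(v, dtype=torch.float32)
           for k, v in client_states[0].items()}
    for state, n in zip(client_states, client_sizes):
        for k, v in state.items():
            avg[k] += v.float() * (n / total)
    return avg
```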
We introduce a curated video dataset of laboratory rodents for automatic detection of convulsive events. The dataset contains short (10 s) top-down and side-view video clips of individual rodents, labeled at clip level as normal activity or seizure. It includes 10,101 negative samples and 2,952 positive samples collected from 19 subjects. We describe the data curation, annotation protocol and preprocessing pipeline, and report baseline experiments using a transformer-based video classifier (TimeSformer). Experiments employ five-fold cross-validation with strict subject-wise partitioning to prevent data leakage (no subject appears in more than one fold). Results show that the TimeSformer architecture enables discrimination between seizure and normal activity with an average F1-score of 97%. The dataset and baseline code are publicly released to support reproducible research on non-invasive, video-based monitoring in preclinical epilepsy research. RodEpil Dataset access - DOI: 10.5281/zenodo.17601357
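Strict subject-wise partitioning of this kind can be reproduced with scikit-learn's GroupKFold, as in this sketch with placeholder clip and subject arrays:

```python
# Subject-wise 5-fold partitioning: no subject's clips appear in both the
# training and test folds. Clip/subject arrays are placeholders.
import numpy as np
from sklearn.model_selection import GroupKFold

clips = np.arange(100)                                    # placeholder clip ids
subjects = np.repeat(np.arange(19), 100 // 19 + 1)[:100]  # placeholder subjects

for train_idx, test_idx in GroupKFold(n_splits=5).split(clips, groups=subjects):
    # Folds are disjoint at the subject level, preventing leakage.
    assert not set(subjects[train_idx]) & set(subjects[test_idx])
```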
Neonates are highly susceptible to seizures, often leading to short- or long-term neurological impairments. However, clinical manifestations of neonatal seizures are subtle and often lead to misdiagnoses. This increases the risk of prolonged, untreated seizure activity and subsequent brain injury. Continuous video electroencephalogram (cEEG) monitoring is the gold standard for seizure detection. However, this is an expensive evaluation that requires expertise and time. In this study, we propose a convolutional neural network-based model for early prediction of neonatal seizures by distinguishing between interictal and preictal states of the EEG. Our model is patient-independent, enabling generalization across multiple subjects, and utilizes mel-frequency cepstral coefficient (MFCC) matrices extracted from multichannel EEG and electrocardiogram (ECG) signals as input features. Trained and validated on the Helsinki neonatal EEG dataset with 10-fold cross-validation, the proposed model achieved an average accuracy of 97.52%, sensitivity of 98.31%, specificity of 96.39%, and F1-score of 97.95%, enabling accurate seizure prediction up to 30 minutes before onset. The inclusion of ECG alongside EEG improved the F1-score by 1.42%, while the incorporation of an attention mechanism yielded an additional 0.5% improvement. To enhance transparency, we incorporated SHapley Additive exPlanations (SHAP) as an explainable artificial intelligence method to interpret the model and provided localization of seizure focus using scalp plots. The overall results demonstrate the model's potential for minimally supervised deployment in neonatal intensive care units, enabling timely and reliable prediction of neonatal seizures, while demonstrating strong generalization capability across unseen subjects through transfer learning.
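Per-channel MFCC matrices of the kind used as input features can be computed as in the following sketch; the sampling rate, window length, and coefficient count are assumptions rather than the paper's exact settings.

```python
# Hedged sketch of per-channel MFCC extraction with librosa (assumed
# 256 Hz sampling rate, 1 s FFT window, 13 coefficients).
import numpy as np
import librosa

def mfcc_matrix(signals: np.ndarray, sr: int = 256, n_mfcc: int = 13):
    """signals: (channels, samples) -> MFCC stack, (channels, n_mfcc, frames)."""
    return np.stack([
        librosa.feature.mfcc(y=ch.astype(np.float32), sr=sr, n_mfcc=n_mfcc,
                             n_fft=256, hop_length=128)
        for ch in signals
    ])

features = mfcc_matrix(np.random.randn(21, 256 * 60))  # one minute, 21 channels
```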
Self-supervised learning (SSL) offers a promising approach for learning electroencephalography (EEG) representations from unlabeled data, reducing the need for expensive annotations for clinical applications like sleep staging and seizure detection. While current EEG SSL methods predominantly use masked reconstruction strategies like masked autoencoders (MAE) that capture local temporal patterns, position prediction pretraining remains underexplored despite its potential to learn long-range dependencies in neural signals. We introduce PAirwise Relative Shift (PARS) pretraining, a novel pretext task that predicts relative temporal shifts between randomly sampled EEG window pairs. Unlike reconstruction-based methods that focus on local pattern recovery, PARS encourages encoders to capture relative temporal composition and long-range dependencies inherent in neural signals. Through comprehensive evaluation on various EEG decoding tasks, we demonstrate that PARS-pretrained transformers consistently outperform existing pretraining strategies in label-efficient and transfer learning settings, establishing a new paradigm for self-supervised EEG representation learning.
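The PARS pretext target can be illustrated by sampling two windows from one recording and labeling the pair with their normalized relative shift; window and recording sizes below are illustrative, and the paper's exact target parameterization may differ.

```python
# Illustrative PARS-style pretext sampling: two random windows from one
# recording, labeled with their signed, normalized temporal offset.
import numpy as np

def sample_pars_pair(recording: np.ndarray, win: int, rng):
    """recording: (channels, samples). Returns (w1, w2, relative_shift)."""
    max_start = recording.shape[1] - win
    s1, s2 = rng.integers(0, max_start, size=2)
    w1, w2 = recording[:, s1:s1 + win], recording[:, s2:s2 + win]
    return w1, w2, (s2 - s1) / max_start   # regression target in [-1, 1]

rng = np.random.default_rng(0)
w1, w2, shift = sample_pars_pair(np.random.randn(19, 30_000), win=1_000, rng=rng)
```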
Foundation models have transformed AI by reducing reliance on task-specific data through large-scale pretraining. While successful in language and vision, their adoption in EEG has lagged due to the heterogeneity of public datasets, which are collected under varying protocols, devices, and electrode configurations. Existing EEG foundation models struggle to generalize across these variations, often restricting pretraining to a single setup, resulting in suboptimal performance, in particular under linear probing. We present REVE (Representation for EEG with Versatile Embeddings), a pretrained model explicitly designed to generalize across diverse EEG signals. REVE introduces a novel 4D positional encoding scheme that enables it to process signals of arbitrary length and electrode arrangement. Using a masked autoencoding objective, we pretrain REVE on over 60,000 hours of EEG data from 92 datasets spanning 25,000 subjects, representing the largest EEG pretraining effort to date. REVE achieves state-of-the-art results on 10 downstream EEG tasks, including motor imagery classification, seizure detection, sleep staging, cognitive load estimation, and emotion recognition. With little to no fine-tuning, it demonstrates strong generalization and nuanced spatio-temporal modeling. We release code, pretrained weights, and tutorials to support standardized EEG research and accelerate progress in clinical neuroscience.
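In the spirit of a 4D positional encoding, the following loose PyTorch sketch concatenates sinusoidal features of the time index and the three electrode coordinates; REVE's actual scheme is its own contribution and will differ in detail.

```python
# Loose sketch: sinusoidal encoding over (time, x, y, z), giving every
# token a position regardless of electrode montage or signal length.
import torch

def sincos(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Standard sinusoidal features of a scalar coordinate, size `dim`."""
    freqs = torch.exp(torch.arange(0, dim, 2)
                      * (-torch.log(torch.tensor(1e4)) / dim))
    angles = x.unsqueeze(-1) * freqs
    return torch.cat([angles.sin(), angles.cos()], dim=-1)

def pos_encode_4d(t: torch.Tensor, xyz: torch.Tensor, dim_per_axis: int = 32):
    # t: (tokens,) time index; xyz: (tokens, 3) electrode coordinates.
    parts = [sincos(t, dim_per_axis)] + [sincos(xyz[:, i], dim_per_axis)
                                         for i in range(3)]
    return torch.cat(parts, dim=-1)            # (tokens, 4 * dim_per_axis)

pe = pos_encode_4d(torch.arange(10.0), torch.randn(10, 3))
print(pe.shape)                                 # torch.Size([10, 128])
```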