Sentiment analysis using electroencephalography (EEG) sensor signals provides a deeper behavioral understanding of a person's emotional state, offering insights into real-time mood fluctuations. This approach takes advantage of the brain's electrical activity, making it a promising tool for applications such as mental health monitoring, affective computing, and personalized user experiences. To enhance this process, an encoder-based model for EEG-to-sentiment analysis is proposed that uses the ZUCO 2.0 dataset and incorporates a Feature Pyramid Network (FPN). FPNs are adapted here to EEG sensor data, enabling multiscale feature extraction that captures both local and global sentiment-related patterns. The raw EEG data from the ZUCO 2.0 dataset is pre-processed and passed through the FPN, which extracts hierarchical features. These features are then fed to a Gated Recurrent Unit (GRU) to model temporal dependencies, improving the accuracy of sentiment classification. The ZUCO 2.0 dataset is used for its detailed 128-channel recordings, which offer rich spatial and temporal resolution. Experimental results show that the proposed architecture achieves a 6.88\% performance gain over existing methods. Furthermore, the proposed framework demonstrates its efficacy on the DEAP and SEED validation datasets.
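
As a rough illustration of the pipeline this abstract describes, the sketch below (PyTorch) chains a 1D feature pyramid over pre-processed EEG with a GRU and a classification head; the layer widths, kernel sizes, and three-class output are illustrative assumptions, not the authors' released implementation.

    import torch
    import torch.nn as nn

    class FPNGRUClassifier(nn.Module):
        def __init__(self, in_channels=128, hidden=64, n_classes=3):
            super().__init__()
            # Bottom-up pathway: progressively downsampled temporal feature maps.
            self.c1 = nn.Sequential(nn.Conv1d(in_channels, 64, 7, stride=2, padding=3), nn.ReLU())
            self.c2 = nn.Sequential(nn.Conv1d(64, 128, 5, stride=2, padding=2), nn.ReLU())
            self.c3 = nn.Sequential(nn.Conv1d(128, 256, 3, stride=2, padding=1), nn.ReLU())
            # Lateral 1x1 convolutions project every level to a common width, FPN-style.
            self.l1 = nn.Conv1d(64, hidden, 1)
            self.l2 = nn.Conv1d(128, hidden, 1)
            self.l3 = nn.Conv1d(256, hidden, 1)
            self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, n_classes)

        def forward(self, x):                          # x: (batch, 128 channels, time)
            f1 = self.c1(x)
            f2 = self.c2(f1)
            f3 = self.c3(f2)
            # Top-down pathway: upsample coarse maps and merge them with finer ones.
            p3 = self.l3(f3)
            p2 = self.l2(f2) + nn.functional.interpolate(p3, size=f2.shape[-1])
            p1 = self.l1(f1) + nn.functional.interpolate(p2, size=f1.shape[-1])
            _, h = self.gru(p1.transpose(1, 2))        # finest pyramid level fed to the GRU
            return self.head(torch.cat([h[0], h[1]], dim=-1))

    logits = FPNGRUClassifier()(torch.randn(2, 128, 512))   # -> shape (2, 3)
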
Human emotions are difficult to convey through words and are often abstracted in the process; however, electroencephalogram (EEG) signals can offer a more direct lens into emotional brain activity. Recent studies show that deep learning models can process these signals to perform emotion recognition with high accuracy. However, many existing approaches overlook the dynamic interplay between distinct brain regions, which can be crucial to understanding how emotions unfold and evolve over time and could aid more accurate emotion recognition. To address this, we propose RBTransformer, a Transformer-based neural network architecture that models the inter-cortical neural dynamics of the brain in latent space to better capture structured neural interactions for effective EEG-based emotion recognition. First, the EEG signals are converted into Band Differential Entropy (BDE) tokens, which are then combined with Electrode Identity embeddings to retain spatial provenance. These tokens are processed through successive inter-cortical multi-head attention blocks that construct an electrode × electrode attention matrix, allowing the model to learn inter-cortical neural dependencies. The resulting features are passed through a classification head to obtain the final prediction. We conducted extensive experiments under subject-dependent settings on the SEED, DEAP, and DREAMER datasets, across the Valence, Arousal, and Dominance dimensions (Dominance for DEAP and DREAMER only), under both binary and multi-class classification settings. The results demonstrate that RBTransformer outperforms all previous state-of-the-art methods across all three datasets and all dimensions under both classification settings. The source code is available at: https://github.com/nnilayy/RBTransformer.
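
A minimal sketch of the tokenization and attention scheme described above, assuming DEAP-style 32-electrode input at 128 Hz; the band definitions, model width, and mean pooling are placeholders rather than RBTransformer's actual configuration.

    import math
    import torch
    import torch.nn as nn
    from scipy.signal import butter, filtfilt

    BANDS = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 31), "gamma": (31, 50)}

    def bde_tokens(eeg, fs=128):
        """eeg: (n_electrodes, n_samples) array -> (n_electrodes, n_bands) differential entropy."""
        feats = []
        for lo, hi in BANDS.values():
            b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
            band = filtfilt(b, a, eeg, axis=-1)
            var = band.var(axis=-1) + 1e-8
            feats.append(0.5 * torch.log(torch.tensor(2 * math.pi * math.e * var)))
        return torch.stack(feats, dim=-1).float()

    class ElectrodeTransformer(nn.Module):
        def __init__(self, n_electrodes=32, d_model=64, n_classes=2):
            super().__init__()
            self.proj = nn.Linear(len(BANDS), d_model)
            self.electrode_emb = nn.Embedding(n_electrodes, d_model)    # spatial provenance
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)   # attends electrode x electrode
            self.head = nn.Linear(d_model, n_classes)

        def forward(self, tokens):                      # tokens: (B, n_electrodes, n_bands)
            ids = torch.arange(tokens.size(1), device=tokens.device)
            x = self.proj(tokens) + self.electrode_emb(ids)
            return self.head(self.encoder(x).mean(dim=1))   # pool over electrodes

    tokens = bde_tokens(torch.randn(32, 128 * 60).numpy()).unsqueeze(0)  # one DEAP-like trial
    print(ElectrodeTransformer()(tokens).shape)                          # torch.Size([1, 2])
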
EEG-based emotion recognition is hampered by profound dataset heterogeneity (channel and subject variability), hindering generalizable models. Existing approaches struggle to transfer knowledge effectively. We propose 'One Model for All', a universal pre-training framework for EEG analysis across disparate datasets. Our paradigm decouples learning into two stages: (1) univariate pre-training via self-supervised contrastive learning on individual channels, enabled by a Unified Channel Schema (UCS) that leverages the channel union (e.g., SEED-62ch, DEAP-32ch); (2) multivariate fine-tuning with a novel Adaptive Resampling Transformer (ART) and Graph Attention Network (GAT) architecture to capture complex spatio-temporal dependencies. Experiments show that universal pre-training is an essential stabilizer, preventing training collapse on SEED (compared with training from scratch) and yielding substantial gains on DEAP (+7.65%) and DREAMER (+3.55%). Our framework achieves new SOTA performance on all within-subject benchmarks: SEED (99.27%), DEAP (93.69%), and DREAMER (93.93%). We also show SOTA cross-dataset transfer, achieving 94.08% (intersection) and 93.05% (UCS) on the unseen DREAMER dataset, with the former surpassing the within-domain pre-training benchmark. Ablation studies validate our architecture: the GAT module is critical, yielding a +22.19% gain over GCN on the high-noise DEAP dataset, and its removal causes a catastrophic -16.44% performance drop. This work paves the way for more universal, scalable, and effective pre-trained models for diverse EEG analysis tasks.
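
The snippet below sketches how a Unified Channel Schema could be built from the channel union and used to condition a single univariate encoder during contrastive pre-training; the channel lists are truncated, and a generic NT-Xent objective stands in for the paper's exact self-supervised loss.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    SEED_CH = ["FP1", "FPZ", "FP2", "AF3", "AF4", "F7"]          # ... 62 channels in total
    DEAP_CH = ["FP1", "AF3", "F3", "F7", "FC5", "FC1"]           # ... 32 channels in total
    UCS = {name: i for i, name in enumerate(sorted(set(SEED_CH) | set(DEAP_CH)))}

    class UnivariateEncoder(nn.Module):
        def __init__(self, d=64):
            super().__init__()
            self.conv = nn.Sequential(nn.Conv1d(1, d, 7, stride=4, padding=3), nn.GELU(),
                                      nn.Conv1d(d, d, 5, stride=4, padding=2), nn.GELU(),
                                      nn.AdaptiveAvgPool1d(1))
            self.ch_emb = nn.Embedding(len(UCS), d)              # one entry per UCS channel

        def forward(self, x, ch_id):                             # x: (B, 1, T), ch_id: (B,)
            return self.conv(x).squeeze(-1) + self.ch_emb(ch_id)

    def nt_xent(z1, z2, tau=0.1):
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        logits = z1 @ z2.t() / tau                               # positives on the diagonal
        return F.cross_entropy(logits, torch.arange(z1.size(0)))

    enc = UnivariateEncoder()
    x = torch.randn(8, 1, 1024)                                  # single-channel windows
    ch = torch.randint(0, len(UCS), (8,))                        # their UCS channel ids
    loss = nt_xent(enc(x, ch), enc(x + 0.1 * torch.randn_like(x), ch))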




Accurately predicting emotions from brain signals has the potential to advance goals such as improving mental health, human-computer interaction, and affective computing. Emotion prediction through neural signals offers a promising alternative to traditional methods, such as self-assessment and facial expression analysis, which can be subjective or ambiguous. Measuring brain activity via electroencephalogram (EEG) provides a more direct and unbiased data source. However, conducting a full EEG recording is a complex, resource-intensive process, which has led to the rise of low-cost EEG devices with simplified measurement capabilities. This work examines how subsets of EEG channels from the DEAP dataset can be used for sufficiently accurate emotion prediction with low-cost EEG devices rather than fully equipped EEG setups. Using the Continuous Wavelet Transform to convert EEG data into scaleograms, we trained a vision transformer (ViT) model for emotion classification. The model achieved over 91.57% accuracy in predicting four quadrants (high/low arousal crossed with high/low valence) using only 12 measuring points (also referred to as channels). Our work clearly shows that a substantial reduction of input channels still yields strong results compared to the state-of-the-art accuracy of 96.9% with 32 channels. Training scripts to reproduce our results can be found here: https://gitlab.kit.edu/kit/aifb/ATKS/public/AutoSMiLeS/DEAP-DIVE.
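
A small sketch of the scaleogram preprocessing step, assuming DEAP's 128 Hz sampling rate; the wavelet, scale range, and channel indices are placeholders (the paper's 12-channel subset is not reproduced here), and the resulting images would then be resized and fed to a ViT.

    import numpy as np
    import pywt

    FS = 128                                        # DEAP's preprocessed sampling rate
    CHANNEL_SUBSET = [0, 1, 2, 3, 16, 17, 18, 19, 28, 29, 30, 31]   # placeholder 12-channel montage

    def to_scaleograms(trial, scales=np.arange(1, 65), wavelet="morl"):
        """trial: (n_channels, n_samples) -> (12, len(scales), n_samples) magnitude scaleograms."""
        images = []
        for ch in CHANNEL_SUBSET:
            coef, _ = pywt.cwt(trial[ch], scales, wavelet, sampling_period=1 / FS)
            images.append(np.abs(coef))
        return np.stack(images)

    trial = np.random.randn(32, FS * 60)            # one 60 s, 32-channel DEAP trial
    imgs = to_scaleograms(trial)                    # (12, 64, 7680); resize these for the ViT
    print(imgs.shape)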




Electroencephalogram (EEG)-based emotion recognition holds significant value in affective computing and brain-computer interfaces. However, in practical applications, EEG recordings are susceptible to various physiological artifacts. Current approaches typically treat denoising and emotion recognition as independent tasks using cascaded architectures, which not only leads to error accumulation but also fails to exploit potential synergies between the tasks. Moreover, conventional EEG-based emotion recognition models often rely on the idealized assumption of "perfectly denoised data", lacking a systematic design for noise robustness. To address these challenges, a novel framework that deeply couples the denoising and emotion recognition tasks is proposed for end-to-end noise-robust emotion recognition, termed the Feedback-Driven Collaborative Network for Denoising-Classification Nexus (FDC-Net). Our primary innovation lies in establishing a dynamic collaborative mechanism between artifact removal and emotion recognition through: (1) bidirectional gradient propagation with joint optimization strategies; and (2) a gated attention mechanism integrated with a frequency-adaptive Transformer using learnable band-position encoding. The two most popular EEG-based emotion datasets (DEAP and DREAMER), both with multi-dimensional emotional labels, were employed to compare the artifact removal and emotion recognition performance of FDC-Net against nine state-of-the-art methods. On the denoising task, FDC-Net obtains a maximum correlation coefficient (CC) of 96.30% on DEAP and 90.31% on DREAMER. On the emotion recognition task under physiological artifact interference, FDC-Net achieves accuracies of 82.3±7.1% on DEAP and 88.1±0.8% on DREAMER.
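
The following sketch shows the generic coupling idea behind such a framework: one shared encoder trained jointly on a reconstruction loss and a classification loss so that gradients from both tasks interact. It does not reproduce FDC-Net's gated attention or frequency-adaptive Transformer, and all sizes are assumptions.

    import torch
    import torch.nn as nn

    class JointDenoiseClassify(nn.Module):
        def __init__(self, n_channels=32, n_classes=2, d=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Conv1d(n_channels, d, 7, padding=3), nn.ReLU(),
                                         nn.Conv1d(d, d, 5, padding=2), nn.ReLU())
            self.denoise_head = nn.Conv1d(d, n_channels, 3, padding=1)   # reconstructs clean EEG
            self.cls_head = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                          nn.Linear(d, n_classes))

        def forward(self, noisy):
            h = self.encoder(noisy)
            return self.denoise_head(h), self.cls_head(h)

    model = JointDenoiseClassify()
    noisy = torch.randn(4, 32, 512)                    # artifact-contaminated epochs
    clean = torch.randn(4, 32, 512)                    # reference (artifact-free) epochs
    labels = torch.randint(0, 2, (4,))
    recon, logits = model(noisy)
    loss = nn.functional.mse_loss(recon, clean) + nn.functional.cross_entropy(logits, labels)
    loss.backward()   # gradients from both tasks flow into the shared encoder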




Mental stress has become a pervasive factor affecting cognitive health and overall well-being, necessitating the development of robust, non-invasive diagnostic tools. Electroencephalogram (EEG) signals provide a direct window into neural activity, yet their non-stationary and high-dimensional nature poses significant modeling challenges. Here we introduce Brain2Vec, a new deep learning tool that classifies stress states from raw EEG recordings using a hybrid architecture of convolutional, recurrent, and attention mechanisms. The model begins with a series of convolutional layers to capture localized spatial dependencies, followed by an LSTM layer to model sequential temporal patterns, and concludes with an attention mechanism to emphasize informative temporal regions. We evaluate Brain2Vec on the DEAP dataset, applying bandpass filtering, z-score normalization, and epoch segmentation as part of a comprehensive preprocessing pipeline. Compared to traditional CNN-LSTM baselines, our proposed model achieves an AUC score of 0.68 and a validation accuracy of 81.25%. These findings demonstrate Brain2Vec's potential for integration into wearable stress monitoring platforms and personalized healthcare systems.
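
A compact sketch of the convolutional, recurrent, and attention ordering described above; the layer counts and sizes are assumptions, since the abstract does not specify Brain2Vec's exact configuration.

    import torch
    import torch.nn as nn

    class ConvLSTMAttention(nn.Module):
        def __init__(self, n_channels=32, d=64, n_classes=2):
            super().__init__()
            self.conv = nn.Sequential(nn.Conv1d(n_channels, d, 7, stride=2, padding=3), nn.ReLU(),
                                      nn.Conv1d(d, d, 5, stride=2, padding=2), nn.ReLU())
            self.lstm = nn.LSTM(d, d, batch_first=True)
            self.attn = nn.Linear(d, 1)                     # scores each time step
            self.head = nn.Linear(d, n_classes)

        def forward(self, x):                               # x: (B, C, T), z-scored epochs
            seq, _ = self.lstm(self.conv(x).transpose(1, 2))
            w = torch.softmax(self.attn(seq), dim=1)        # attention weights over time
            return self.head((w * seq).sum(dim=1))          # attention-pooled representation

    print(ConvLSTMAttention()(torch.randn(4, 32, 512)).shape)   # torch.Size([4, 2])
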
Electroencephalography (EEG) signals provide a promising and involuntary reflection of brain activity related to emotional states, offering significant advantages over behavioral cues such as facial expressions. However, EEG signals are often noisy, affected by artifacts, and variable across individuals, complicating emotion recognition. While multimodal approaches have used Peripheral Physiological Signals (PPS) such as GSR to complement EEG, they often overlook the dynamic synchronization and consistent semantics between the modalities. Additionally, the temporal dynamics of emotional fluctuations across different time resolutions in PPS remain underexplored. To address these challenges, we propose PhysioSync, a novel pre-training framework leveraging temporal and cross-modal contrastive learning, inspired by physiological synchronization phenomena. PhysioSync incorporates Cross-Modal Consistency Alignment (CM-CA) to model dynamic relationships between EEG and complementary PPS, enabling emotion-related synchronization across modalities. In addition, it introduces Long- and Short-Term Temporal Contrastive Learning (LS-TCL) to capture emotional synchronization at different temporal resolutions within each modality. After pre-training, cross-resolution and cross-modal features are hierarchically fused and fine-tuned to enhance emotion recognition. Experiments on the DEAP and DREAMER datasets demonstrate PhysioSync's strong performance under uni-modal and cross-modal conditions, highlighting its effectiveness for EEG-centered emotion recognition.
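
As an illustration of the cross-modal contrastive idea, the sketch below computes a symmetric InfoNCE loss between EEG and PPS embeddings from matched windows; the dummy encoders and temperature are placeholders, not PhysioSync's CM-CA or LS-TCL modules.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def cross_modal_infonce(z_eeg, z_pps, tau=0.07):
        """Matched EEG/PPS windows are positives; other windows in the batch are negatives."""
        z_eeg, z_pps = F.normalize(z_eeg, dim=-1), F.normalize(z_pps, dim=-1)
        logits = z_eeg @ z_pps.t() / tau
        targets = torch.arange(z_eeg.size(0))
        return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

    eeg_enc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 256, 128))   # placeholder encoders
    pps_enc = nn.Sequential(nn.Flatten(), nn.Linear(4 * 256, 128))
    eeg, pps = torch.randn(16, 32, 256), torch.randn(16, 4, 256)      # temporally aligned windows
    loss = cross_modal_infonce(eeg_enc(eeg), pps_enc(pps))
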
EEG-based neural networks, pivotal in medical diagnosis and brain-computer interfaces, face significant intellectual property (IP) risks due to their reliance on sensitive neurophysiological data and resource-intensive development. Current watermarking methods, particularly those using abstract trigger sets, lack robust authentication and fail to address the unique challenges of EEG models. This paper introduces a cryptographic wonder filter-based watermarking framework tailored for EEG-based neural networks. Leveraging collision-resistant hashing and public-key encryption, the wonder filter embeds the watermark during training, ensuring minimal distortion ($\leq 5\%$ drop in EEG task accuracy) and high reliability (100\% watermark detection). The framework is rigorously evaluated against adversarial attacks, including fine-tuning, transfer learning, and neuron pruning. Results demonstrate persistent watermark retention, with classification accuracy for watermarked states remaining above 90\% even after aggressive pruning, while primary task performance degrades faster, deterring removal attempts. Piracy resistance is validated by the inability to embed secondary watermarks without severe accuracy loss ($>10\%$ in EEGNet and CCNN models). Cryptographic hashing ensures authentication, reducing brute-force attack success probabilities. Evaluated on the DEAP dataset across models (CCNN, EEGNet, TSception), the method achieves $>99.4\%$ null-embedding accuracy, effectively eliminating false positives. By integrating wonder filters with EEG-specific adaptations, this work bridges a critical gap in IP protection for neurophysiological models, offering a secure, tamper-proof solution for healthcare and biometric applications. The framework's robustness against adversarial modifications underscores its potential to safeguard sensitive EEG models while maintaining diagnostic utility.
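
One way to picture the hash-bound trigger idea is the sketch below, in which a SHA-256 digest of a secret key deterministically seeds watermark inputs and labels that are mixed into training; this is a generic backdoor-style embedding and omits the wonder filter's null-embedding and public-key verification machinery.

    import hashlib
    import torch

    def make_trigger_set(secret_key: bytes, n=64, channels=32, samples=128, n_classes=2):
        digest = hashlib.sha256(secret_key).digest()                  # collision-resistant binding
        gen = torch.Generator().manual_seed(int.from_bytes(digest[:8], "big"))
        x = torch.randn(n, channels, samples, generator=gen)          # owner-reproducible trigger inputs
        y = torch.randint(0, n_classes, (n,), generator=gen)          # fixed watermark labels
        return x, y

    wm_x, wm_y = make_trigger_set(b"owner-private-key")
    # During training, batches of (wm_x, wm_y) are interleaved with normal EEG batches;
    # at verification time, high accuracy on the trigger set demonstrates ownership.
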
Emotion recognition has significant potential in healthcare and affect-sensitive systems such as brain-computer interfaces (BCIs). However, challenges such as the high cost of labeled data and variability in electroencephalogram (EEG) signals across individuals limit the applicability of EEG-based emotion recognition models across domains. These challenges are exacerbated in cross-dataset scenarios due to differences in subject demographics, recording devices, and presented stimuli. To address these issues, we propose a novel approach to improve cross-domain EEG-based emotion classification. Our method, Gradual Proximity-guided Target Data Selection (GPTDS), incrementally selects reliable target domain samples for training. By evaluating their proximity to source clusters and the model's confidence in predicting them, GPTDS minimizes negative transfer caused by noisy and diverse samples. Additionally, we introduce Prediction Confidence-aware Test-Time Augmentation (PC-TTA), a cost-effective augmentation technique. Unlike traditional TTA methods, which are computationally intensive, PC-TTA activates only when model confidence is low, improving inference performance while drastically reducing computational costs. Experiments on the DEAP and SEED datasets validate the effectiveness of our approach. When trained on DEAP and tested on SEED, our model achieves 67.44% accuracy, a 7.09% improvement over the baseline. Conversely, training on SEED and testing on DEAP yields 59.68% accuracy, a 6.07% improvement. Furthermore, PC-TTA reduces computational time by a factor of 15 compared to traditional TTA methods. Our method excels in detecting both positive and negative emotions, demonstrating its practical utility in healthcare applications. Code available at: https://github.com/RyersonMultimediaLab/EmotionRecognitionUDA
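
PC-TTA's gating is straightforward to sketch: augment at inference time only when the plain prediction is uncertain. The confidence threshold, jitter augmentation, and placeholder classifier below are assumptions rather than the paper's exact setup.

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def pc_tta_predict(model: nn.Module, x: torch.Tensor, threshold: float = 0.8, n_aug: int = 8):
        """x: (1, channels, samples). Returns class probabilities."""
        probs = torch.softmax(model(x), dim=-1)
        if probs.max() >= threshold:                     # confident: skip augmentation entirely
            return probs
        views = [x + 0.05 * torch.randn_like(x) for _ in range(n_aug)]   # cheap jittered views
        aug_probs = torch.softmax(model(torch.cat(views)), dim=-1)
        return aug_probs.mean(dim=0, keepdim=True)       # average only when confidence was low

    clf = nn.Sequential(nn.Flatten(), nn.Linear(32 * 256, 2))            # placeholder classifier
    print(pc_tta_predict(clf, torch.randn(1, 32, 256)))
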
EEG-based emotion recognition holds significant potential in the field of brain-computer interfaces. A key challenge lies in extracting discriminative spatiotemporal features from electroencephalogram (EEG) signals. Existing studies often rely on domain-specific time-frequency features and analyze temporal dependencies and spatial characteristics separately, neglecting the interaction between local-global relationships and spatiotemporal dynamics. To address this, we propose a novel network called Multi-Scale Inverted Mamba (MS-iMamba), which consists of Multi-Scale Temporal Blocks (MSTB) and Temporal-Spatial Fusion Blocks (TSFB). Specifically, MSTBs are designed to capture both local details and global temporal dependencies across different scale subsequences. The TSFBs, implemented with an inverted Mamba structure, focus on the interaction between dynamic temporal dependencies and spatial characteristics. The primary advantage of MS-iMamba lies in its ability to leverage reconstructed multi-scale EEG sequences, exploiting the interaction between temporal and spatial features without the need for domain-specific time-frequency feature extraction. Experimental results on the DEAP, DREAMER, and SEED datasets demonstrate that MS-iMamba achieves classification accuracies of 94.86%, 94.94%, and 91.36%, respectively, using only four-channel EEG signals, outperforming state-of-the-art methods.
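
The multi-scale temporal idea can be sketched as below, where the raw sequence is re-sampled at several strides and each subsequence gets its own temporal encoder before fusion; a GRU stands in for the inverted Mamba block (which would normally come from a selective state-space implementation), and the scales and widths are assumptions.

    import torch
    import torch.nn as nn

    class MultiScaleTemporal(nn.Module):
        def __init__(self, n_channels=4, d=32, scales=(1, 2, 4), n_classes=2):
            super().__init__()
            self.scales = scales
            self.encoders = nn.ModuleList(nn.GRU(n_channels, d, batch_first=True) for _ in scales)
            self.head = nn.Linear(d * len(scales), n_classes)

        def forward(self, x):                                   # x: (B, C=4, T), four-channel EEG
            feats = []
            for s, enc in zip(self.scales, self.encoders):
                sub = x[..., ::s].transpose(1, 2)               # subsequence at stride s
                _, h = enc(sub)
                feats.append(h[-1])                             # last hidden state per scale
            return self.head(torch.cat(feats, dim=-1))          # fuse scales before classification

    print(MultiScaleTemporal()(torch.randn(8, 4, 512)).shape)   # torch.Size([8, 2])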