Abstract:The geometric structure of latent representations in large language models (LLMs) is an active area of research, driven in part by its implications for model transparency and AI safety. Existing literature has focused mainly on general geometric and topological properties of the learnt representations, but due to a lack of ground-truth latent geometry, validating the findings of such approaches is challenging. Emotion processing provides an intriguing testbed for probing representational geometry, as emotions exhibit both categorical organization and continuous affective dimensions, which are well-established in the psychology literature. Moreover, understanding such representations carries safety relevance. In this work, we investigate the latent structure of affective representations in LLMs using geometric data analysis tools. We present three main findings. First, we show that LLMs learn coherent latent representations of affective emotions that align with widely used valence--arousal models from psychology. Second, we find that these representations exhibit nonlinear geometric structure that can nonetheless be well-approximated linearly, providing empirical support for the linear representation hypothesis commonly assumed in model transparency methods. Third, we demonstrate that the learned latent representation space can be leveraged to quantify uncertainty in emotion processing tasks. Our findings suggest that LLMs acquire affective representations with geometric structure paralleling established models of human emotion, with practical implications for model interpretability and safety.
Abstract:Electromyogenic (EMG) noise is a major contamination source in EEG data that can impede accurate analysis of brain-specific neural activity. Recent literature on EMG artifact removal has moved beyond traditional linear algorithms in favor of machine learning-based systems. However, existing deep learning-based filtration methods often have large compute footprints and prohibitively long training times. In this study, we present a new machine learning-based system for filtering EMG interference from EEG data using an autoencoder-targeted adversarial transformer (AT-AT). By leveraging the lightweight expressivity of an autoencoder to determine optimal time-series transformer application sites, our AT-AT architecture achieves a >90% model size reduction compared to published artifact removal models. The addition of adversarial training ensures that filtered signals adhere to the fundamental characteristics of EEG data. We trained AT-AT using published neural data from 67 subjects and found that the system was able to achieve comparable test performance to larger models; AT-AT posted a mean reconstructive correlation coefficient above 0.95 at an initial signal-to-noise ratio (SNR) of 2 dB and 0.70 at -7 dB SNR. Further research generalizing these results to broader sample sizes beyond these isolated test cases will be crucial; while outside the scope of this study, we also include results from a real-world deployment of AT-AT in the Appendix.
Abstract:Brain-computer interfaces (BCIs) offer transformative potential, but decoding neural signals presents significant challenges. The core premise of this paper is built around demonstrating methods to elucidate the underlying low-dimensional geometric structure present in high-dimensional brainwave data in order to assist in downstream BCI-related neural classification tasks. We demonstrate two pipelines related to electroencephalography (EEG) signal processing: (1) a preliminary pipeline removing noise from individual EEG channels, and (2) a downstream manifold learning pipeline uncovering geometric structure across networks of EEG channels. We conduct preliminary validation using two EEG datasets and situate our demonstration in the context of the BCI-relevant imagined digit decoding problem. Our preliminary pipeline uses an attention-based EEG filtration network to extract clean signal from individual EEG channels. Our primary pipeline uses a fast Fourier transform, a Laplacian eigenmap, a discrete analog of Ricci flow via Ollivier's notion of Ricci curvature, and a graph convolutional network to perform dimensionality reduction on high-dimensional multi-channel EEG data in order to enable regularizable downstream classification. Our system achieves competitive performance with existing signal processing and classification benchmarks; we demonstrate a mean test correlation coefficient of >0.95 at 2 dB on semi-synthetic neural denoising and a downstream EEG-based classification accuracy of 0.97 on distinguishing digit- versus non-digit thoughts. Results are preliminary and our geometric machine learning pipeline should be validated by more extensive follow-up studies; generalizing these results to larger inter-subject sample sizes, different hardware systems, and broader use cases will be crucial.




Abstract:Current machine learning (ML)-based algorithms for filtering electroencephalography (EEG) time series data face challenges related to cumbersome training times, regularization, and accurate reconstruction. To address these shortcomings, we present an ML filtration algorithm driven by a logistic covariance-targeted adversarial denoising autoencoder (TADA). We hypothesize that the expressivity of a targeted, correlation-driven convolutional autoencoder will enable effective time series filtration while minimizing compute requirements (e.g., runtime, model size). Furthermore, we expect that adversarial training with covariance rescaling will minimize signal degradation. To test this hypothesis, a TADA system prototype was trained and evaluated on the task of removing electromyographic (EMG) noise from EEG data in the EEGdenoiseNet dataset, which includes EMG and EEG data from 67 subjects. The TADA filter surpasses conventional signal filtration algorithms across quantitative metrics (Correlation Coefficient, Temporal RRMSE, Spectral RRMSE), and performs competitively against other deep learning architectures at a reduced model size of less than 400,000 trainable parameters. Further experimentation will be necessary to assess the viability of TADA on a wider range of deployment cases.