Abstract:Deep learning models, especially convolutional neural networks (CNNs), have shown considerable promise for biomedical signals such as EEG-based seizure detection. However, these models come with challenges, primarily due to their size and compute requirements in environments where real-time detection or limited resources are available. In this study, we present a lightweight one-dimensional CNN model with structured pruning to improve efficiency and reliability. The model was trained with mild early stopping to address possible overfitting, achieving an accuracy of 92.78% and a macro-F1 score of 0.8686. Structured pruning of the baseline CNN involved removing 50% of the convolutional kernels based on their importance to model predictions. Surprisingly, after pruning the weights and memory by 50%, the new network was still able to maintain predictive capabilities, while modestly increasing precision to 92.87% and improving the macro-F1 score to 0.8707. Overall, we present a convincing case that structured pruning removes redundancy, improves generalization, and, in combination with mild early stopping, achieves a promising way forward to improve seizure detection efficiency and reliability, which is clear motivation for resource-limited settings.
Abstract:In reinforcement learning with human feedback (RLHF), reward models can efficiently learn and amplify latent biases within multimodal datasets, which can lead to imperfect policy optimization through flawed reward signals and decreased fairness. Bias mitigation studies have often applied passive constraints, which can fail under causal confounding. Here, we present a counterfactual reward model that introduces causal inference with multimodal representation learning to provide an unsupervised, bias-resilient reward signal. The heart of our contribution is the Counterfactual Trust Score, an aggregated score consisting of four components: (1) counterfactual shifts that decompose political framing bias from topical bias; (2) reconstruction uncertainty during counterfactual perturbations; (3) demonstrable violations of fairness rules for each protected attribute; and (4) temporal reward shifts aligned with dynamic trust measures. We evaluated the framework on a multimodal fake versus true news dataset, which exhibits framing bias, class imbalance, and distributional drift. Following methodologies similar to unsupervised drift detection from representation-based distances [1] and temporal robustness benchmarking in language models [2], we also inject synthetic bias across sequential batches to test robustness. The resulting system achieved an accuracy of 89.12% in fake news detection, outperforming the baseline reward models. More importantly, it reduced spurious correlations and unfair reinforcement signals. This pipeline outlines a robust and interpretable approach to fairness-aware RLHF, offering tunable bias reduction thresholds and increasing reliability in dynamic real-time policy making.
Abstract:In applied machine learning, concept drift, which is either gradual or abrupt changes in data distribution, can significantly reduce model performance. Typical detection methods,such as statistical tests or reconstruction-based models,are generally reactive and not very sensitive to early detection. Our study proposes a hybrid framework consisting of Transformers and Autoencoders to model complex temporal dynamics and provide online drift detection. We create a distinct Trust Score methodology, which includes signals on (1) statistical and reconstruction-based drift metrics, more specifically, PSI, JSD, Transformer-AE error, (2) prediction uncertainty, (3) rules violations, and (4) trend of classifier error aligned with the combined metrics defined by the Trust Score. Using a time sequenced airline passenger data set with synthetic drift, our proposed model allows for a better detection of drift using as a whole and at different detection thresholds for both sensitivity and interpretability compared to baseline methods and provides a strong pipeline for drift detection in real time for applied machine learning. We evaluated performance using a time-sequenced airline passenger dataset having the gradually injected stimulus of drift in expectations,e.g. permuted ticket prices in later batches, broken into 10 time segments [1].In the data, our results support that the Transformation-Autoencoder detected drift earlier and with more sensitivity than the autoencoders commonly used in the literature, and provided improved modeling over more error rates and logical violations. Therefore, a robust framework was developed to reliably monitor concept drift.