Abstract: Properties of ocular fixations and saccades are highly stochastic during many experimental tasks, and their statistics are often used as proxies for various aspects of cognition. Although distinguishing saccades from fixations is not trivial, experimentalists generally use common ad-hoc thresholds in detection algorithms. This neglects inter-task and inter-individual variability in oculomotor dynamics, and potentially biases the resulting statistics. In this article, we introduce and evaluate an adaptive method based on a Markovian approximation of eye-gaze dynamics, using saccades and fixations as states such that the optimal threshold minimizes state transitions. Applying this to three common threshold-based algorithms (velocity, angular velocity, and dispersion), we evaluate the overall accuracy against a multi-threshold benchmark as well as robustness to noise. We find that a velocity threshold achieves the highest baseline accuracy (90-93%) across both free-viewing and visual search tasks. However, velocity-based methods degrade rapidly under noise when thresholds remain fixed, with accuracy falling below 20% at high noise levels. Adaptive threshold optimization via K-ratio minimization substantially improves performance under noisy conditions for all algorithms. Adaptive dispersion thresholds demonstrate superior noise robustness, maintaining accuracy above 81% even at extreme noise levels (σ = 50 px), though a precision-recall trade-off emerges that favors fixation detection at the expense of saccade identification. In addition to demonstrating our parsimonious adaptive thresholding method, these findings provide practical guidance for selecting and tuning classification algorithms based on data quality and analytical priorities.
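The idea of treating fixations and saccades as two states and counting switches between them can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the K-ratio, so the code below only computes the raw number of state transitions that a candidate velocity threshold produces. The function names, the toy gaze trace, the 100 Hz sampling interval, and the threshold values are all assumptions made for illustration.

```python
import math

def classify_velocity(xs, ys, dt, threshold):
    """Label each inter-sample interval: True = saccade, False = fixation."""
    labels = []
    for i in range(1, len(xs)):
        # Point-to-point speed in px/s between consecutive samples.
        speed = math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]) / dt
        labels.append(speed > threshold)
    return labels

def count_transitions(labels):
    """Number of fixation<->saccade switches in the label sequence."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)

# Toy gaze trace (pixels): a jittery fixation, one fast jump, another fixation.
xs = [100, 101, 100, 102, 160, 220, 221, 220, 222]
ys = [100, 100, 101, 100, 130, 160, 160, 161, 160]
dt = 0.01  # hypothetical 100 Hz sampling interval, in seconds

for threshold in (120, 1000):  # px/s, illustrative values only
    labels = classify_velocity(xs, ys, dt, threshold)
    print(threshold, count_transitions(labels))
# prints:
# 120 3
# 1000 2
```

In this toy trace the lower threshold splits the fixation jitter into spurious saccades and so yields more transitions than the higher one. A raw count alone is not a usable criterion, because a threshold above every observed speed gives zero transitions; how the paper's K-ratio handles this is not stated in the abstract.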
Abstract: Accurate modeling of eye gaze dynamics is essential for advances in human-computer interaction, neurological diagnostics, and cognitive research. Traditional generative models like Markov models often fail to capture the complex temporal dependencies and distributional nuances inherent in eye gaze trajectory data. This study introduces a GAN framework employing LSTM and CNN generators and discriminators to generate high-fidelity synthetic eye gaze velocity trajectories. We conducted a comprehensive evaluation of four GAN architectures (CNN-CNN, LSTM-CNN, CNN-LSTM, and LSTM-LSTM), each trained under two conditions: using only adversarial loss, and using a weighted combination of adversarial and spectral losses. Our findings reveal that the LSTM-CNN architecture trained with the combined loss exhibits the closest alignment to the real data distribution, effectively capturing both the distribution tails and the intricate temporal dependencies. The inclusion of spectral regularization significantly enhances the GAN's ability to replicate the spectral characteristics of eye gaze movements, leading to a more stable learning process and improved data fidelity. Comparative analysis with an HMM optimized to four hidden states further highlights the advantages of the LSTM-CNN GAN. Statistical metrics show that the HMM-generated data significantly diverges from the real data in terms of mean, standard deviation, skewness, and kurtosis. In contrast, the LSTM-CNN model closely matches the real data across these statistics, affirming its capacity to model the complexity of eye gaze dynamics effectively. These results position the spectrally regularized LSTM-CNN GAN as a robust tool for generating synthetic eye gaze velocity data with high fidelity.
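A weighted combination of adversarial and spectral losses could take a form like the sketch below. The abstract does not specify the spectral term or its weight, so everything here is an assumption: the spectral loss is taken to be the mean squared difference between log power spectra of a real and a generated sequence, the adversarial term is passed in as a precomputed number, and `weight=0.1` is a placeholder. A plain DFT keeps the example dependency-free; an FFT routine would be the usual choice in practice.

```python
import cmath
import math

def power_spectrum(signal):
    """Power at each non-negative frequency bin, computed with a plain DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def spectral_loss(real, fake, eps=1e-8):
    """Mean squared difference of log power spectra (assumed form).

    Both sequences are expected to have the same length.
    """
    p_real, p_fake = power_spectrum(real), power_spectrum(fake)
    return sum((math.log(a + eps) - math.log(b + eps)) ** 2
               for a, b in zip(p_real, p_fake)) / len(p_real)

def generator_loss(adversarial, real, fake, weight=0.1):
    """Adversarial term plus a weighted spectral term; `weight` is a placeholder."""
    return adversarial + weight * spectral_loss(real, fake)

# Toy velocity sequences: the "fake" one oscillates at the wrong frequency.
n = 16
real = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
fake = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

print(spectral_loss(real, real))        # identical spectra give zero loss
print(spectral_loss(real, fake) > 0.0)  # mismatched spectra are penalized
```

In a training loop, a term of this kind would be added to the generator's objective so that samples with the wrong frequency content are penalized even when the discriminator is fooled.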