Abstract:Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication and behavioral patterns. Eye movement data offers a non-invasive diagnostic tool for ASD detection, as it is inherently discrete and exhibits short-term temporal dependencies, reflecting localized gaze focus between fixation points. These characteristics enable the data to provide deeper insights into subtle behavioral markers, distinguishing ASD-related patterns from typical development. Eye movement signals mainly contain short-term and localized dependencies. However, despite the widespread application of stacked attention layers in Transformer-based models for capturing long-range dependencies, our experimental results indicate that this approach yields only limited benefits when applied to eye movement data. This may be because discrete fixation points and short-term dependencies in gaze focus reduce the utility of global attention mechanisms, making them less efficient than architectures focusing on local temporal patterns. To efficiently capture subtle and complex eye movement patterns, distinguishing ASD from typically developing (TD) individuals, a discrete short-term sequential (DSTS) modeling framework is designed with Class-aware Representation and Imbalance-aware Mechanisms. Through extensive experiments on several eye movement datasets, DSTS outperforms both traditional machine learning techniques and more sophisticated deep learning models.




Abstract:Categorical attributes with qualitative values are ubiquitous in cluster analysis of real datasets. Unlike the Euclidean distance of numerical attributes, the categorical attributes lack well-defined relationships of their possible values (also called categories interchangeably), which hampers the exploration of compact categorical data clusters. Although most attempts are made for developing appropriate distance metrics, they typically assume a fixed topological relationship between categories when learning distance metrics, which limits their adaptability to varying cluster structures and often leads to suboptimal clustering performance. This paper, therefore, breaks the intrinsic relationship tie of attribute categories and learns customized distance metrics suitable for flexibly and accurately revealing various cluster distributions. As a result, the fitting ability of the clustering algorithm is significantly enhanced, benefiting from the learnable category relationships. Moreover, the learned category relationships are proved to be Euclidean distance metric-compatible, enabling a seamless extension to mixed datasets that include both numerical and categorical attributes. Comparative experiments on 12 real benchmark datasets with significance tests show the superior clustering accuracy of the proposed method with an average ranking of 1.25, which is significantly higher than the 5.21 ranking of the current best-performing method.