Abstract:Gait analysis provides an objective characterization of locomotor function and is widely used to support diagnosis and rehabilitation monitoring across neurological and orthopedic disorders. Deep learning has been increasingly applied to this domain, yet most approaches rely on supervised classifiers trained on disease-labeled data, limiting generalization to heterogeneous pathological presentations. This work proposes a label-free framework for joint-level anomaly detection and kinematic correction based on a Transformer masked autoencoder trained exclusively on normative gait sequences from 150 adults, acquired with a markerless multi-camera motion-capture system. At inference, a two-pass procedure is applied to potentially pathological input sequences, first it estimates joint inconsistency scores by occluding individual joints and measuring deviations from the learned normative prior. Then, it withholds the flagged joints from the encoder input and reconstructs the full skeleton from the remaining spatiotemporal context, yielding corrected kinematic trajectories at the flagged positions. Validation on 10 held-out normative participants, who mimicked seven simulated gait abnormalities, showed accurate localization of biomechanically inconsistent joints, a significant reduction in angular deviation across all analyzed joints with large effect sizes, and preservation of normative kinematics. The proposed approach enables interpretable, subject-specific localization of gait impairments without requiring disease labels. Video is available at https://youtu.be/Rcm3jqR5pN4.




Abstract:We introduce a novel state-space model (SSM)-based framework for skeleton-based human action recognition, with an anatomically-guided architecture that improves state-of-the-art performance in both clinical diagnostics and general action recognition tasks. Our approach decomposes skeletal motion analysis into spatial, temporal, and spatio-temporal streams, using channel partitioning to capture distinct movement characteristics efficiently. By implementing a structured, multi-directional scanning strategy within SSMs, our model captures local joint interactions and global motion patterns across multiple anatomical body parts. This anatomically-aware decomposition enhances the ability to identify subtle motion patterns critical in medical diagnosis, such as gait anomalies associated with neurological conditions. On public action recognition benchmarks, i.e., NTU RGB+D, NTU RGB+D 120, and NW-UCLA, our model outperforms current state-of-the-art methods, achieving accuracy improvements up to $3.2\%$ with lower computational complexity than previous leading transformer-based models. We also introduce a novel medical dataset for motion-based patient neurological disorder analysis to validate our method's potential in automated disease diagnosis.