Abstract:Event cameras record luminance changes with microsecond resolution, but converting their sparse, asynchronous output into dense tensors that neural networks can exploit remains a core challenge. Conventional histograms or globally-decayed time-surface representations apply fixed temporal parameters across the entire image plane, which in practice creates a trade-off between preserving spatial structure during still periods and retaining sharp edges during rapid motion. We introduce Locally Adaptive Decay Surfaces (LADS), a family of event representations in which the temporal decay at each location is modulated according to local signal dynamics. Three strategies are explored, based on event rate, Laplacian-of-Gaussian response, and high-frequency spectral energy. These adaptive schemes preserve detail in quiescent regions while reducing blur in regions of dense activity. Extensive experiments on the public data show that LADS consistently improves both face detection and facial landmark accuracy compared to standard non-adaptive representations. At 30 Hz, LADS achieves higher detection accuracy and lower landmark error than either baseline, and at 240 Hz it mitigates the accuracy decline typically observed at higher frequencies, sustaining 2.44 % normalized mean error for landmarks and 0.966 mAP50 in face detection. These high-frequency results even surpass the accuracy reported in prior works operating at 30 Hz, setting new benchmarks for event-based face analysis. Moreover, by preserving spatial structure at the representation stage, LADS supports the use of much lighter network architectures while still retaining real-time performance. These results highlight the importance of context-aware temporal integration for neuromorphic vision and point toward real-time, high-frequency human-computer interaction systems that exploit the unique advantages of event cameras.
Abstract:Event camera-based driver monitoring is emerging as a pivotal area of research, driven by its significant advantages such as rapid response, low latency, power efficiency, enhanced privacy, and prevention of undersampling. Effective detection of driver distraction is crucial in driver monitoring systems to enhance road safety and reduce accident rates. The integration of an optimized sensor such as Event Camera with an optimized network is essential for maximizing these benefits. This paper introduces the innovative concept of sensing without seeing to detect driver distraction, leveraging computationally efficient spiking neural networks (SNN). To the best of our knowledge, this study is the first to utilize event camera data with spiking neural networks for driver distraction. The proposed Spiking-DD network not only achieve state of the art performance but also exhibit fewer parameters and provides greater accuracy than current event-based methodologies.




Abstract:Neuromorphic vision sensors, or event cameras, differ from conventional cameras in that they do not capture images at a specified rate. Instead, they asynchronously log local brightness changes at each pixel. As a result, event cameras only record changes in a given scene, and do so with very high temporal resolution, high dynamic range, and low power requirements. Recent research has demonstrated how these characteristics make event cameras extremely practical sensors in driver monitoring systems (DMS), enabling the tracking of high-speed eye motion and blinks. This research provides a proof of concept to expand event-based DMS techniques to include seatbelt state detection. Using an event simulator, a dataset of 108,691 synthetic neuromorphic frames of car occupants was generated from a near-infrared (NIR) dataset, and split into training, validation, and test sets for a seatbelt state detection algorithm based on a recurrent convolutional neural network (CNN). In addition, a smaller set of real event data was collected and reserved for testing. In a binary classification task, the fastened/unfastened frames were identified with an F1 score of 0.989 and 0.944 on the simulated and real test sets respectively. When the problem extended to also classify the action of fastening/unfastening the seatbelt, respective F1 scores of 0.964 and 0.846 were achieved.
Abstract:Driver monitoring systems (DMS) are a key component of vehicular safety and essential for the transition from semiautonomous to fully autonomous driving. A key task for DMS is to ascertain the cognitive state of a driver and to determine their level of tiredness. Neuromorphic vision systems, based on event camera technology, provide advanced sensing of facial characteristics, in particular the behavior of a driver's eyes. This research explores the potential to extend neuromorphic sensing techniques to analyze the entire facial region, detecting yawning behaviors that give a complimentary indicator of tiredness. A neuromorphic dataset is constructed from 952 video clips (481 yawns, 471 not-yawns) captured with an RGB color camera, with 37 subjects. A total of 95200 neuromorphic image frames are generated from this video data using a video-to-event converter. From these data 21 subjects were selected to provide a training dataset, 8 subjects were used for validation data, and the remaining 8 subjects were reserved for an "unseen" test dataset. An additional 12300 frames were generated from event simulations of a public dataset to test against other methods. A CNN with self-attention and a recurrent head was designed, trained, and tested with these data. Respective precision and recall scores of 95.9 percent and 94.7 percent were achieved on our test set, and 89.9 percent and 91 percent on the simulated public test set, demonstrating the feasibility to add yawn detection as a sensing component of a neuromorphic DMS.




Abstract:Event cameras contain emerging, neuromorphic vision sensors that capture local light intensity changes at each pixel, generating a stream of asynchronous events. This way of acquiring visual information constitutes a departure from traditional frame based cameras and offers several significant advantages: low energy consumption, high temporal resolution, high dynamic range and low latency. Driver monitoring systems (DMS) are in-cabin safety systems designed to sense and understand a drivers physical and cognitive state. Event cameras are particularly suited to DMS due to their inherent advantages. This paper proposes a novel method to simultaneously detect and track faces and eyes for driver monitoring. A unique, fully convolutional recurrent neural network architecture is presented. To train this network, a synthetic event-based dataset is simulated with accurate bounding box annotations, called Neuromorphic HELEN. Additionally, a method to detect and analyse drivers eye blinks is proposed, exploiting the high temporal resolution of event cameras. Behaviour of blinking provides greater insights into a driver level of fatigue or drowsiness. We show that blinks have a unique temporal signature that can be better captured by event cameras.