Abstract:This paper considers the adaptation of the e-coaching concept to emergencies and disasters by augmenting e-coaching with intelligent tools for monitoring humans' affective state. States such as anxiety, panic, avoidance, and stress, if properly detected, can be mitigated using e-coaching tactics and strategies. In this work, we focus on a stress monitoring assistant tool built on machine learning techniques. We provide the results of an experimental study using the proposed method.
Abstract:In this paper, we investigate hand gesture classifiers that rely upon the abstracted 'skeletal' data recorded using an RGB-Depth sensor. We focus on 'skeletal' data represented by the body joint coordinates from the PRAXIS dataset. The PRAXIS dataset contains recordings of patients with cortical pathologies such as Alzheimer's disease, performing a Praxis test under the direction of a clinician. We propose hand gesture classifiers that are more effective on the PRAXIS dataset than previously proposed models. Body joint data offers a compressed form of data that can be analyzed specifically for hand gesture recognition. Using a combination of windowing techniques with a deep learning architecture such as a Recurrent Neural Network (RNN), we achieved an overall accuracy of 70.8% using only body joint data. In addition, we investigated a long short-term memory (LSTM) network to extract and analyze the movement of the joints through time to recognize the hand gestures being performed, and achieved gesture recognition rates of 74.3% and 67.3% for static and dynamic gestures, respectively. The proposed approach contributes to the task of developing an automated, accurate, and inexpensive approach to diagnosing cortical pathologies for multiple healthcare applications.
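The windowing step mentioned above can be illustrated with a minimal sketch. The abstract does not state the window length, stride, or joint layout, so `size`, `stride`, and the toy recording below are hypothetical choices for illustration only, not the authors' settings.

```python
from typing import List, Sequence

Frame = Sequence[float]  # flattened joint coordinates for one time step


def sliding_windows(frames: Sequence[Frame], size: int, stride: int) -> List[List[Frame]]:
    """Split a recording into fixed-length, possibly overlapping windows."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    last_start = len(frames) - size
    return [list(frames[s:s + size]) for s in range(0, last_start + 1, stride)]


# Toy recording (assumed shape): 10 frames, each holding 2 joints x 3 coordinates.
recording = [[float(t)] * 6 for t in range(10)]
windows = sliding_windows(recording, size=4, stride=2)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window could then be passed to a recurrent model as one input sequence; trailing frames that do not fill a whole window are dropped in this sketch.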
Abstract:Accelerometry has been extensively studied as an objective means of measuring upper limb function in patients post-stroke. The objective of this paper is to determine whether the accelerometry-derived measurements frequently used in more long-term rehabilitation studies can also be used to monitor and rapidly detect sudden changes in upper limb motor function in more recently hospitalized stroke patients. Six binary classification models were created by training on variable data window times of paretic upper limb accelerometer feature data. The models were assessed on their effectiveness for differentiating new input data into two classes: severe or moderately severe motor function. The classification models yielded Area Under the Curve (AUC) scores that ranged from 0.72 to 0.82 for 15-minute data windows to 0.77 to 0.94 for 120-minute data windows. These results served as a preliminary assessment and a basis on which to further investigate the efficacy of using accelerometry and machine learning to alert healthcare professionals to rapid changes in motor function in the days immediately following a stroke.
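For readers unfamiliar with the AUC metric reported above, the following sketch computes it for a binary classifier via pairwise comparison of scores (the Mann-Whitney formulation). The scores and labels are made-up illustrative values, not data from the study, and the function name `auc_score` is a hypothetical helper.

```python
from typing import Sequence


def auc_score(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Illustrative classifier outputs; label 1 is an arbitrary choice of positive class.
example_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
example_labels = [1, 1, 0, 1, 0]
example_auc = auc_score(example_scores, example_labels)
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.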
Abstract:Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a compact deep learning framework referred to as the CT-HGR, which employs a vision transformer network to conduct hand gesture recognition using high-density sEMG (HD-sEMG) signals. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computations and addresses the memory limitation problems that arise with inputs of large sequence lengths. CT-HGR can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the CT-HGR framework can perform instantaneous recognition using an sEMG image spatially composed from HD-sEMG signals. A variant of the CT-HGR is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate the potential of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed CT-HGR framework is applied to window sizes of 31.25, 62.5, 125, and 250 ms of the above-mentioned dataset, using 32, 64, and 128 electrode channels. The average accuracy over all the participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases to 91.98% for 128 electrodes and a window size of 250 ms. The CT-HGR achieves an accuracy of 89.13% for instantaneous recognition based on a single frame of HD-sEMG image.
Abstract:Mission teams are exposed to the emotional toll of life-and-death decisions. These are small groups of specially trained people supported by intelligent machines for dealing with stressful environments and scenarios. We developed a composite model for stress monitoring in such teams of humans and autonomous machines. This modelling aims to identify the conditions that may contribute to mission failure. The proposed model is composed of three parts: 1) a computational logic part that statically describes the stress states of teammates; 2) a decision part that manifests the mission status at any time; 3) a stress propagation part based on the standard Susceptible-Infected-Susceptible (SIS) paradigm. In contrast to approaches such as agent-based, random-walk, and game models, the proposed model combines various mechanisms to satisfy the conditions of stress propagation in small groups. Our core approach involves data structures such as decision tables and decision diagrams. These tools are adaptable to human-machine teaming as well.
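As background for the SIS paradigm named above, here is a minimal sketch of the textbook discrete-time, mean-field SIS update. It is not the authors' small-group model: the rates `beta` (stress transmission) and `gamma` (recovery), the initial fraction, and the step count are all assumed values for illustration.

```python
from typing import List


def sis_step(stressed: float, beta: float, gamma: float) -> float:
    """One update of the stressed fraction: new cases minus recoveries."""
    return stressed + beta * stressed * (1.0 - stressed) - gamma * stressed


def simulate_sis(initial: float, beta: float, gamma: float, steps: int) -> List[float]:
    """Iterate the SIS update and return the whole trajectory."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(sis_step(trajectory[-1], beta, gamma))
    return trajectory


# Assumed parameters: 10% of the team initially stressed, beta > gamma.
trajectory = simulate_sis(initial=0.1, beta=0.3, gamma=0.1, steps=200)
endemic_level = 1.0 - 0.1 / 0.3  # fixed point 1 - gamma/beta of the update
print(round(trajectory[-1], 4), round(endemic_level, 4))
```

In this simplified setting, stress persists at the level `1 - gamma/beta` when `beta > gamma` and dies out otherwise, which is the kind of threshold behaviour a propagation model can be used to examine.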
Abstract:In this paper, we study performance and fairness on visual and thermal images and expand the assessment to masked synthetic images. Using the SpeakingFaces and Thermal-Mask datasets, we propose a process to assess fairness on real images and show how the same process can be applied to synthetic images. The resulting process shows a demographic parity difference of 1.59 for random guessing, which increases to 5.0 when the recognition performance increases to a precision and recall rate of 99.99%. We indicate that inherently biased datasets can deeply impact the fairness of any biometric system. A primary cause of a biased dataset is the class imbalance due to the data collection process. To address imbalanced datasets, the classes with fewer samples can be augmented with synthetic images to generate a more balanced dataset, resulting in less bias when training a machine learning system. For biometric-enabled systems, fairness is of critical importance, and the related concept of Equity, Diversity, and Inclusion (EDI) is well suited for the generalization of fairness in biometrics. In this paper, we focus on the three most common demographic attributes: age, gender, and ethnicity.
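The demographic parity difference reported above can be illustrated with a minimal sketch that uses the common definition: the gap between the highest and lowest positive-prediction rate across demographic groups. The abstract does not specify the exact definition or scaling used in the paper, so the function below and its toy data are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Tuple


def demographic_parity_difference(
    predictions: Sequence[int], groups: Sequence[Hashable]
) -> Tuple[float, Dict[Hashable, float]]:
    """Return (max rate - min rate, per-group positive-prediction rates)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(predictions, groups):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    rates = {g: pos / total for g, (pos, total) in counts.items()}
    return max(rates.values()) - min(rates.values()), rates


# Toy predictions for two hypothetical demographic groups "a" and "b".
toy_predictions = [1, 1, 0, 1, 0, 0, 1, 0]
toy_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
dpd, group_rates = demographic_parity_difference(toy_predictions, toy_groups)
print(dpd, group_rates)
```

A value of 0 means every group receives positive predictions at the same rate; larger values indicate a larger disparity between groups.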
Abstract:In this study of face recognition on masked versus unmasked faces generated using the Flickr-Faces-HQ and SpeakingFaces datasets, we report a 36.78% degradation of recognition performance caused by mask wearing during a pandemic, in particular in border checkpoint scenarios. We have achieved better performance and reduced the degradation to 1.79% using advanced deep learning approaches in the cross-spectral domain.
Abstract:This paper considers e-coaching in times of pandemic. It utilizes the Emergency Management Cycle (EMC), a core doctrine for managing disasters. The EMC dimensions provide a useful taxonomical view for the development and application of e-coaching systems, emphasizing technological and societal issues. Typical pandemic symptoms such as anxiety, panic, avoidance, and stress, if properly detected, can be mitigated using e-coaching tactics and strategies. In this work, we focus on a stress monitoring assistant built on machine learning techniques. We provide the results of an experimental study of a prototype of such an assistant. Our study leads to the conclusion that stress monitoring shall become a valuable component of e-coaching at all EMC phases.
Abstract:Wide dynamic range (WDR) images contain more scene details and contrast when compared to common images. However, they require tone mapping to process the pixel values in order to be displayed properly. The details of WDR images can diminish during the tone mapping process. In this work, we address the problem by combining a novel reformulated Laplacian pyramid and deep learning. The reformulated Laplacian pyramid always decomposes a WDR image into two frequency bands, where the low-frequency band is global feature-oriented and the high-frequency band is local feature-oriented. The reformulation preserves the local features in their original resolution and condenses the global features into a low-resolution image. The generated frequency bands are reconstructed and fine-tuned to output the final tone-mapped image, which can be displayed on a screen with minimum detail and contrast loss. The experimental results demonstrate that the proposed method outperforms state-of-the-art WDR image tone mapping methods. The code is made publicly available at https://github.com/linmc86/Deep-Reformulated-Laplacian-Tone-Mapping.
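The two-band idea described above can be sketched in a simplified form: a low-resolution band that carries global structure and a full-resolution residual that carries local detail. This is not the authors' reformulated pyramid (see their repository for that); it assumes box-filter averaging and nearest-neighbour upsampling in place of the usual Gaussian filtering, and the helper names and the `factor` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]


def downsample(img: Image, factor: int) -> Image:
    """Average non-overlapping factor x factor blocks (dimensions must divide evenly)."""
    rows, cols = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / area
             for x in range(cols)] for y in range(rows)]


def upsample(img: Image, factor: int) -> Image:
    """Nearest-neighbour upsampling back to the original resolution."""
    return [[img[y // factor][x // factor] for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]


def decompose(img: Image, factor: int) -> Tuple[Image, Image]:
    """Return (low band at reduced resolution, high band at full resolution)."""
    low = downsample(img, factor)
    up = upsample(low, factor)
    high = [[p - u for p, u in zip(row, up_row)] for row, up_row in zip(img, up)]
    return low, high


def reconstruct(low: Image, high: Image, factor: int) -> Image:
    """Invert decompose by adding the upsampled low band to the high band."""
    up = upsample(low, factor)
    return [[u + h for u, h in zip(up_row, row)] for up_row, row in zip(up, high)]


# Toy 4x4 image with integer-valued intensities.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
low_band, high_band = decompose(image, factor=2)
restored = reconstruct(low_band, high_band, factor=2)
```

Because the high band stores exactly what the low band misses, the two bands reconstruct the input without loss, so each band can be processed separately before recombination.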
Abstract:Currently, face detection approaches focus on facial information by varying specific parameters including pose, occlusion, lighting, background, race, and gender. These studies only utilized the information obtained from low dynamic range images; face detection in wide dynamic range (WDR) scenes has received little attention. To our knowledge, there is no publicly available WDR database for face detection research. To facilitate and support future face detection research in the WDR field, we propose the first WDR database for face detection, called WDR FACE, which contains a total of 398 16-bit megapixel grayscale WDR images collected from 29 subjects. These WDR images (WDRIs) were taken in eight specific WDR scenes. The dynamic range of 90% of the images surpasses 60,000:1, and that of 70% of the images exceeds 65,000:1. Furthermore, we show the effect of different face detection procedures on the WDRIs in our database. This is done with 25 different tone mapping operators and five different face detectors. We provide preliminary experimental results of face detection on this unique WDR database.
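To make the dynamic range figures above concrete, the sketch below computes a contrast ratio as the brightest pixel value divided by the darkest. The abstract does not state how the authors measured dynamic range, so this definition, the `floor` clamp that avoids division by zero, and the toy pixel values are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple


def dynamic_range(pixels: Sequence[Sequence[int]], floor: int = 1) -> Tuple[float, float]:
    """Return (max/min ratio, the same ratio in photographic stops)."""
    values = [max(p, floor) for row in pixels for p in row]  # clamp zeros to floor
    ratio = max(values) / min(values)
    return ratio, math.log2(ratio)


# Toy 16-bit grayscale patch (values lie in 0..65535).
patch = [[1, 120], [40000, 65535]]
ratio, stops = dynamic_range(patch)
print(f"{ratio:.0f}:1, about {stops:.1f} stops")
```

Under this definition, a 16-bit integer image clamped at 1 can represent a ratio of at most 65,535:1, so ratios above 65,000:1 imply that the image uses almost the entire 16-bit range.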