Early-stage Alzheimer's disease (AD) detection is considered an important field of medical research. Like traditional machine learning methods, speech-based automatic detection suffers from data privacy risks because patient data are held exclusively by each medical institution. A common practice is to use federated learning to protect patients' data privacy. However, its distributed learning process also reduces performance. To alleviate this problem while protecting user privacy, we propose federated contrastive pre-training (FedCPC), performed before federated training for AD speech detection, which can learn a better representation from raw data and enables different clients to share data in the pre-training and training stages. Experimental results demonstrate that the proposed method can achieve satisfactory performance while preserving data privacy.
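As a rough illustration of the federated setting described above, the sketch below shows a size-weighted parameter average in the style of FedAvg. The abstract does not state which aggregation rule FedCPC uses, so the rule, the function name `federated_average`, and the dictionary layout of the parameters are all assumptions made for illustration only.

```python
from typing import Dict, List


def federated_average(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Return the size-weighted average of per-client parameter vectors.

    Assumption: FedAvg-style aggregation; the abstract does not specify
    the rule actually used by FedCPC.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one positive sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Each client contributes in proportion to its local data size.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged
```

Only parameters leave each client in this scheme; the raw speech recordings stay local, which is the privacy property that federated learning relies on.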
Multimodal sentiment analysis is an important area for understanding the user's internal states. Deep learning methods have been effective, but their poor interpretability has gradually gained attention. Previous works have attempted to use attention weights or vector distributions to provide interpretability. However, their explanations are not intuitive and can be influenced by different trained models. This study proposes a novel approach that provides interpretability by converting nonverbal modalities into text descriptions and using large-scale language models for sentiment prediction. This offers an intuitive way to directly interpret what the models depend on when making decisions from input texts, thus significantly improving interpretability. Specifically, we generate descriptions from two feature patterns for the audio modality and from discrete action units for the facial modality. Experimental results on two sentiment analysis tasks demonstrate that the proposed approach maintains, or even improves, effectiveness for sentiment analysis compared to baselines using conventional features, with the highest improvement of 2.49% on the F1 score. The results also show that multimodal descriptions exhibit characteristics in fusing modalities similar to those of conventional fusion methods. Overall, the results demonstrate that the proposed approach is interpretable and effective for multimodal sentiment analysis.
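To make the idea of converting facial action units into text concrete, here is a minimal sketch. The action unit names follow the standard Facial Action Coding System labels, but the sentence template, the function name `describe_action_units`, and the subset of units covered are hypothetical and are not taken from the paper.

```python
from typing import Iterable

# Standard FACS names for a small, illustrative subset of action units.
AU_NAMES = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    4: "brow lowerer",
    6: "cheek raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
}


def describe_action_units(active_aus: Iterable[int]) -> str:
    """Turn a set of active action units into one plain-text sentence.

    Assumption: the template wording is invented for this example.
    Unknown action unit numbers are silently skipped.
    """
    names = [AU_NAMES[au] for au in sorted(set(active_aus)) if au in AU_NAMES]
    if not names:
        return "The speaker shows no notable facial movement."
    if len(names) == 1:
        joined = names[0]
    else:
        joined = ", ".join(names[:-1]) + " and " + names[-1]
    return f"The speaker's face shows: {joined}."
```

A sentence produced this way can be concatenated with the transcript and passed to a language model, so the evidence the model sees is readable text rather than an opaque feature vector.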
Driving assistance systems that support drivers by adapting to individual psychological characteristics can provide appropriate feedback and prevent traffic accidents. As a first step toward implementing such adaptive assistance systems, this research aims to develop a model that estimates drivers' psychological characteristics, such as cognitive function, psychological driving style, and workload sensitivity, from on-road driving behavioral data using machine learning and deep learning techniques. We also investigate the relationship between driving behavior and various cognitive functions, including the Trail Making Test and the Useful Field of View test, through regression modeling. The proposed method focuses on road type information and captures various durations of time-series data observed from driving behaviors. First, we segment the driving time-series data by road type, namely, arterial roads and intersections, to account for driving situations. Second, we further segment the data into many sequences of various durations. Third, statistics are calculated from each sequence. Finally, these statistics are used as input features of machine learning models to predict psychological characteristics. The experimental results show that our model can predict a driver's cognitive function, namely, the Trail Making Test version B and Useful Field of View test scores, with Pearson correlation coefficients $r$ of 0.579 and 0.557, respectively. Some characteristics, such as psychological driving style and workload sensitivity, are predicted with high accuracy, but whether segmentation into various durations improves accuracy depends on the characteristic; it is not effective for all of them. Additionally, we identify the sensors and road types that are important for estimating cognitive function.
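The second and third steps above (segmenting into sequences of various durations, then computing statistics per sequence) can be sketched as follows. This is a minimal illustration under stated assumptions: windows are non-overlapping, durations are measured in samples, and the chosen statistics (mean, standard deviation, minimum, maximum) and the name `segment_statistics` are hypothetical, since the abstract does not list them. The road type split is omitted.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def segment_statistics(
    signal: Sequence[float],
    durations: Sequence[int],
) -> List[Dict[str, float]]:
    """Cut one sensor signal into windows of several lengths and summarize each.

    Assumptions: non-overlapping windows, durations given in samples,
    trailing samples that do not fill a window are dropped.
    """
    rows: List[Dict[str, float]] = []
    for duration in durations:
        if duration <= 0:
            raise ValueError("durations must be positive")
        # Step through the signal one full window at a time.
        for start in range(0, len(signal) - duration + 1, duration):
            segment = signal[start:start + duration]
            rows.append({
                "duration": duration,
                "start": start,
                "mean": mean(segment),
                "std": pstdev(segment),  # population standard deviation
                "min": min(segment),
                "max": max(segment),
            })
    return rows
```

For example, calling `segment_statistics(speed, [50, 200])` on a hypothetical speed trace would yield short-window rows that capture brief maneuvers alongside long-window rows that capture sustained behavior; both sets can then be fed to a regression model.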
As the importance of intrusion detection and prevention systems (IDPSs) increases, great costs are incurred to manage the signatures that are generated by malicious communication pattern files. Experts in network security need to classify signatures by importance for an IDPS to work. We propose and evaluate a machine learning signature classification model with a reject option (RO) to reduce the cost of setting up an IDPS. To train the proposed model, it is essential to design features that are effective for signature classification. Experts classify signatures with predefined if-then rules. An if-then rule returns a label of low, medium, high, or unknown importance based on keyword matching of the elements in the signature. Therefore, we first design two types of features, symbolic features (SFs) and keyword features (KFs), which are used in keyword matching for the if-then rules. Next, we design web information and message features (WMFs) to capture the properties of signatures that do not match the if-then rules. The WMFs are extracted as term frequency-inverse document frequency (TF-IDF) features of the message text in the signatures. The features are obtained by web scraping from the external attack identification systems referenced in the signature. Because misclassification needs to be minimized for IDPS signatures, as in the medical field, we introduce an RO in our proposed model. The effectiveness of the proposed classification model is evaluated in experiments with two real datasets composed of signatures labeled by experts: a dataset that can be classified with if-then rules and a dataset with elements that do not match an if-then rule. In both cases, the combined SFs and WMFs performed better than the combined SFs and KFs. In addition, we performed feature analysis.
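A reject option can be illustrated with a simple confidence threshold: the classifier commits to a label only when its top class probability is high enough and otherwise defers to a human expert. The abstract does not say how the RO is implemented in the proposed model, so the thresholding rule, the default value 0.8, and the name `predict_with_reject` are assumptions for illustration.

```python
from typing import Dict


def predict_with_reject(
    probabilities: Dict[str, float],
    threshold: float = 0.8,  # hypothetical value, not from the paper
) -> str:
    """Return the most probable label, or "reject" if confidence is too low.

    Assumption: a plain confidence threshold on the top class probability.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return label if confidence >= threshold else "reject"
```

Raising the threshold sends more signatures to manual review but lowers the risk of an automated misclassification, which is the trade-off an RO is meant to expose.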