Abstract:Human detection in videos plays an important role in various real-life applications. Most traditional approaches depend on utilizing handcrafted features, which are problem-dependent and optimal for specific tasks. Moreover, they are highly susceptible to dynamical events such as illumination changes, camera jitter, and variations in object sizes. On the other hand, the proposed feature learning approaches are cheaper and easier because highly abstract and discriminative features can be produced automatically without the need of expert knowledge. In this paper, we utilize automatic feature learning methods, which combine optical flow and three different deep models (i.e., supervised convolutional neural network (S-CNN), pretrained CNN feature extractor, and hierarchical extreme learning machine) for human detection in videos captured using a nonstatic camera on an aerial platform with varying altitudes. The models are trained and tested on the publicly available and highly challenging UCF-ARG aerial dataset. The comparison between these models in terms of training, testing accuracy, and learning speed is analyzed. The performance evaluation considers five human actions (digging, waving, throwing, walking, and running). Experimental results demonstrated that the proposed methods are successful for the human detection task. The pretrained CNN produces an average accuracy of 98.09%. S-CNN produces an average accuracy of 95.6% with softmax and 91.7% with Support Vector Machines (SVM). H-ELM has an average accuracy of 95.9%. Using a normal Central Processing Unit (CPU), H-ELM's training time takes 445 seconds. Learning in S-CNN takes 770 seconds with a high-performance Graphical Processing Unit (GPU).
Abstract:Intrusion Detection Systems (IDS) play a vital role in modern cybersecurity frameworks by providing a primary defense mechanism against sophisticated threat actors. In this paper, we propose an explainable intrusion detection framework that integrates Recursive Feature Elimination (RFE) with Random Forest (RF) to enhance detection of Advanced Persistent Threats (APTs). By using CICIDS2017 dataset, the approach begins with comprehensive data preprocessing and narrows down the most significant features via RFE. A Random Forest (RF) model was trained on the refined feature set, with SHapley Additive exPlanations (SHAP) used to interpret the contribution of each selected feature. Our experiment demonstrates that the explainable RF-RFE achieved a detection accuracy of 99.9%, reducing false positive and computational cost in comparison to traditional classifiers. The findings underscore the effectiveness of integrating explainable AI and feature selection to develop a robust, transparent, and deployable IDS solution.




Abstract:The human action classification task is a widely researched topic and is still an open problem. Many state-of-the-arts approaches involve the usage of bag-of-video-words with spatio-temporal local features to construct characterizations for human actions. In order to improve beyond this standard approach, we investigate the usage of co-occurrences between local features. We propose the usage of co-occurrences information to characterize human actions. A trade-off factor is used to define an optimal trade-off between vocabulary size and classification rate. Next, a spatio-temporal co-occurrence technique is applied to extract co-occurrence information between labeled local features. Novel characterizations for human actions are then constructed. These include a vector quantized correlogram-elements vector, a highly discriminative PCA (Principal Components Analysis) co-occurrence vector and a Haralick texture vector. Multi-channel kernel SVM (support vector machine) is utilized for classification. For evaluation, the well known KTH as well as the challenging UCF-Sports action datasets are used. We obtained state-of-the-arts classification performance. We also demonstrated that we are able to fully utilize co-occurrence information, and improve the standard bag-of-video-words approach.