This paper reviews and raises concerns about adopting, fielding, and maintaining artificial intelligence (AI) systems. While the AI community has made rapid progress, certifying AI systems remains challenging. Procedures drawn from design and operational test and evaluation offer opportunities to determine performance bounds and thereby manage expectations of intended use. A notional use case with image data fusion is presented to support the certifiability of AI object recognition, considering precision versus distance.
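To make the precision-versus-distance analysis concrete, the following minimal sketch bins object detections by distance and reports per-bin precision, yielding a notional performance bound over an intended operating range. The data, bin edges, and function name are hypothetical illustrations, not the paper's method.

```python
# Notional sketch: per-distance-bin precision as a performance bound.
# Bin edges and detections are hypothetical.
from collections import defaultdict

def precision_by_distance(detections, edges=(0, 50, 100, 200)):
    """detections: list of (distance_m, is_true_positive) tuples."""
    tp, total = defaultdict(int), defaultdict(int)
    for dist, correct in detections:
        for lo, hi in zip(edges, edges[1:]):
            if lo <= dist < hi:
                total[(lo, hi)] += 1
                tp[(lo, hi)] += int(correct)
    return {b: tp[b] / total[b] for b in total}

dets = [(20, True), (30, True), (70, True), (80, False), (150, False)]
print(precision_by_distance(dets))  # {(0, 50): 1.0, (50, 100): 0.5, (100, 200): 0.0}
```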
During natural disasters, aircraft and satellites are used to survey the impacted regions. Human experts are usually needed to manually label the degree of building damage so that proper humanitarian assistance and disaster response (HADR) can be delivered, which is labor-intensive and time-consuming. Relying on human labeling of a major disaster over a wide area gravely slows HADR efforts. It is thus of crucial interest to leverage cutting-edge artificial intelligence and machine learning techniques to speed up the natural infrastructure damage assessment process and achieve effective HADR. Accordingly, the paper demonstrates a systematic effort to achieve efficient building damage classification. First, two novel generative adversarial networks (GANs) are designed to augment the data used to train the deep-learning-based classifier. Second, a contrastive-learning-based method using novel data structures is developed to achieve strong performance. Third, by using information fusion, the classifier is effectively trained with very few training samples for transfer learning. All of the classifiers are small enough to be loaded on a smartphone or simple laptop for first responders. On the available overhead imagery dataset, results demonstrate data and computational efficiency: using 10% of the collected data combined with a GAN reduces computation time from roughly half a day to about one hour while achieving roughly similar classification performance.
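As an illustration of pairing GAN-augmented views with contrastive training, the sketch below implements a standard NT-Xent-style contrastive loss in PyTorch, treating each real damage image and its GAN-generated counterpart as a positive pair. This is a generic sketch under stated assumptions (embedding size, temperature), not the authors' code.

```python
# Minimal NT-Xent-style contrastive loss pairing real images with
# GAN-augmented views; all names and sizes are illustrative assumptions.
import torch
import torch.nn.functional as F

def nt_xent_loss(z_real, z_aug, temperature=0.5):
    """z_real, z_aug: (N, D) embeddings of real images and GAN-augmented views."""
    z = F.normalize(torch.cat([z_real, z_aug], dim=0), dim=1)  # (2N, D)
    sim = z @ z.t() / temperature                              # cosine similarities
    n = z_real.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                      # drop self-similarity
    # positives: the i-th real image pairs with the i-th augmented view, and vice versa
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

z_real = torch.randn(8, 128)   # stand-in embeddings from a small backbone
z_aug = torch.randn(8, 128)
print(nt_xent_loss(z_real, z_aug).item())
```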
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist, focused on the principles of good analysis adopted by the intelligence community (IC), to help promote the development of more understandable systems and engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from the stakeholders. While consistency for testing might only exist across a standard dataset, the community requires discussion on verification and validation approaches that can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system supporting various operational needs, including sourcing, uncertainty, consistency, accuracy, and visualization. Three notional use cases are presented to support comparative analysis for security applications.
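One way to picture a scorecard of this kind is as a simple data structure that rates a system against the tradecraft attributes named above. The sketch below is an illustrative assumption about how such a checklist might be encoded (the 1-3 rating scale and class names are hypothetical), not an official MAST implementation.

```python
# Illustrative scorecard data structure; attribute list comes from the
# abstract, the 1-3 scale and API are hypothetical.
from dataclasses import dataclass, field

MAST_ATTRIBUTES = ["sourcing", "uncertainty", "consistency", "accuracy", "visualization"]

@dataclass
class MastScorecard:
    system_name: str
    scores: dict = field(default_factory=dict)  # attribute -> score in {1, 2, 3}

    def rate(self, attribute: str, score: int) -> None:
        assert attribute in MAST_ATTRIBUTES and score in (1, 2, 3)
        self.scores[attribute] = score

    def summary(self) -> float:
        """Mean score over the rated attributes (0 if none rated)."""
        return sum(self.scores.values()) / len(self.scores) if self.scores else 0.0

card = MastScorecard("notional-object-recognizer")
card.rate("sourcing", 2)
card.rate("accuracy", 3)
print(card.summary())  # 2.5
```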
Wildfires are among the costliest and deadliest natural disasters in the US, damaging millions of hectares of forest resources and threatening the lives of people and animals. Of particular importance are the risks to firefighters and operational forces, which highlight the need to leverage technology to minimize danger to people and property. FLAME (Fire Luminosity Airborne-based Machine learning Evaluation) offers a dataset of aerial images of fires, along with methods for fire detection and segmentation, which can help firefighters and researchers develop optimal fire management strategies. This paper provides a fire image dataset collected by drones during a prescribed burn of piled detritus in an Arizona pine forest. The dataset includes video recordings and thermal heatmaps captured by infrared cameras. The captured videos and images are annotated and labeled frame-wise to help researchers easily apply their fire detection and modeling algorithms. The paper also presents solutions to two machine learning problems: (1) binary classification of video frames based on the presence or absence of fire flames, for which an Artificial Neural Network (ANN) method is developed that achieves 76% classification accuracy; and (2) fire detection using segmentation methods to precisely determine fire borders, for which a deep learning method based on the U-Net up-sampling and down-sampling approach is designed to extract a fire mask from the video frames. The FLAME method achieves a precision of 92% and a recall of 84%. Future research will extend the technique to free-burning broadcast fires using thermal images.
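To illustrate the U-Net-style up-/down-sampling idea used for fire-mask extraction, here is a minimal encoder-decoder sketch in PyTorch. The depth, channel counts, and input size are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal U-Net-style encoder-decoder producing a 1-channel fire-mask logit map.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)              # 16 (skip) + 16 (upsampled)
        self.head = nn.Conv2d(16, 1, 1)             # per-pixel fire-mask logits

    def forward(self, x):
        e1 = self.enc1(x)                           # down-sampling path
        e2 = self.enc2(self.pool(e1))
        d1 = self.up(e2)                            # up-sampling path
        d1 = self.dec1(torch.cat([d1, e1], dim=1))  # skip connection
        return self.head(d1)

net = TinyUNet()
frame = torch.randn(1, 3, 128, 128)                 # one RGB video frame
print(net(frame).shape)                             # torch.Size([1, 1, 128, 128])
```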
Situation AWareness (SAW) is essential for many mission-critical applications. However, achieving SAW is very challenging when operators must immediately identify objects of interest or zoom in on suspicious activities among thousands of video frames. This work aims at developing a queryable system to instantly select interesting content. While face recognition technology is mature, in many scenarios, such as public safety monitoring, the features of objects of interest may be much more complicated than facial features. In addition, human operators may not always be able to provide a descriptive, simple, and accurate query. More often, only rough, general descriptions of certain suspicious objects or accidents are available. This paper proposes Interactive Video Surveillance as an Edge service (I-ViSE) based on unsupervised feature queries. Adopting unsupervised methods that do not reveal any private information, the I-ViSE scheme utilizes general features of a human body and the color of clothes. An I-ViSE prototype is built following the edge-fog computing paradigm, and the experimental results verify that the I-ViSE scheme meets the design goal of scene recognition in less than two seconds.
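The sketch below suggests how a rough clothing-color query might be matched against per-detection color histograms without identity-revealing features. The histogram scheme, hue range, and function names are assumptions for illustration, not the I-ViSE implementation.

```python
# Illustrative sketch: matching a rough "red clothing" query against
# hue histograms of detected clothing regions (no facial features used).
import numpy as np

def hue_histogram(pixels_hsv, bins=12):
    """Normalized hue histogram of a clothing region given as (N, 3) HSV rows."""
    hist, _ = np.histogram(pixels_hsv[:, 0], bins=bins, range=(0, 180))
    return hist / max(hist.sum(), 1)

def query_score(region_hist, query_hist):
    """Histogram intersection: 1.0 = perfect color match, 0.0 = disjoint."""
    return float(np.minimum(region_hist, query_hist).sum())

# A rough "red top" query: all mass near hue 0 (OpenCV-style 0-180 hue range).
query = np.zeros(12)
query[0] = 1.0
region = np.random.randint(0, 180, size=(500, 3))  # stand-in for a detected torso crop
print(query_score(hue_histogram(region), query))
```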
Urban imagery usually serves forensic analysis and by design is available for incident mitigation. As more imagery is collected, it becomes harder to narrow thousands of video clips down to the frames associated with a specific incident. A real-time, proactive surveillance system is desirable that could instantly detect dubious personnel, identify suspicious activities, or raise momentous alerts. The recent proliferation of the edge computing paradigm allows more data-intensive tasks to be accomplished by smart edge devices with lightweight but powerful algorithms. This paper presents a forensic surveillance strategy, Instant Suspicious Activity identiFication at the Edge (I-SAFE), using fuzzy decision making. A fuzzy control system is proposed to mimic the decision-making process of a security officer. Decisions are made based on video features extracted by a lightweight Deep Machine Learning (DML) model. Based on requirements from first-line law enforcement officers, several features are selected and fuzzified to cope with the uncertainty that exists in the officers' decision-making process. Using features within the edge hierarchy minimizes communication delay such that instant alerting is achieved. Additionally, by leveraging the microservices architecture, the I-SAFE scheme possesses good scalability given the increasing complexities at the network edge. Implemented as an edge-based application and tested using various labeled surveillance video datasets, the I-SAFE scheme raises alerts by identifying suspicious activity in an average of 0.002 seconds. Compared with four other state-of-the-art methods on two additional datasets, the experimental study verified the superiority of the decentralized I-SAFE method.
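To show the flavor of fuzzified decision making over extracted video features, the sketch below applies triangular membership functions and a min (AND) rule to two illustrative features. The feature names, breakpoints, and rule are assumptions for illustration, not the paper's actual rule base.

```python
# Minimal fuzzy-rule sketch: triangular memberships over two assumed
# video features, combined with min() as the fuzzy AND.
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def alert_level(loiter_s, density):
    long_loiter = tri(loiter_s, 30, 120, 300)   # degree of "long loitering"
    sparse = tri(density, -0.5, 0.0, 0.4)       # degree of "sparse scene"
    # Rule: IF loitering is long AND the scene is sparse THEN raise an alert
    return min(long_loiter, sparse)

print(alert_level(loiter_s=150, density=0.1))   # suspicious: high alert degree
print(alert_level(loiter_s=10, density=0.1))    # benign: zero alert degree
```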
Detecting objects in aerial images is challenging for at least two reasons: (1) target objects such as pedestrians are very small in terms of pixels, making them hard to distinguish from the surrounding background; and (2) targets are in general sparsely and nonuniformly distributed, making detection very inefficient. In this paper, we address both issues, inspired by the observation that these targets are often clustered. In particular, we propose a Clustered Detection (ClusDet) network that unifies object clustering and detection in an end-to-end framework. The key components of ClusDet include a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet). Given an input image, CPNet produces (object) cluster regions and ScaleNet estimates object scales for these regions. Then, each scale-normalized cluster region and its features are fed into DetecNet for object detection. Compared with previous solutions, ClusDet has several advantages: (1) it greatly reduces the number of blocks needed for final object detection and hence achieves high running-time efficiency; (2) the cluster-based scale estimation is more accurate than the previously used single-object-based estimates, effectively improving the detection of small objects; and (3) the final DetecNet is dedicated to clustered regions and implicitly models prior context information, boosting detection accuracy. The proposed method is tested on three representative aerial image datasets: VisDrone, UAVDT, and DOTA. In all experiments, ClusDet achieves promising performance in both efficiency and accuracy in comparison with state-of-the-art detectors.
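The three-stage flow just described can be summarized schematically as below, with each sub-network stubbed out. The stub outputs, shapes, and scale-normalization step are illustrative assumptions, not the authors' implementation.

```python
# Schematic sketch of the ClusDet pipeline with stubbed sub-networks.
import numpy as np

def cpnet(image):
    """Cluster proposal sub-network: return (x, y, w, h) cluster regions."""
    return [(100, 50, 200, 150), (400, 300, 180, 120)]   # stub proposals

def scalenet(image, region):
    """Scale estimation sub-network: return an up/down-scaling factor."""
    return 2.0                                           # stub: small objects -> zoom in

def detecnet(chip):
    """Dedicated detector on a scale-normalized cluster chip."""
    return [("pedestrian", 0.9, (10, 10, 30, 60))]       # stub detections

def clusdet(image):
    detections = []
    for (x, y, w, h) in cpnet(image):
        chip = image[y:y + h, x:x + w]
        s = scalenet(image, (x, y, w, h))
        chip = np.kron(chip, np.ones((int(s), int(s), 1)))  # crude scale normalization
        for label, score, (bx, by, bw, bh) in detecnet(chip):
            # map chip-local boxes back to full-image coordinates
            detections.append((label, score, (x + bx / s, y + by / s, bw / s, bh / s)))
    return detections

print(clusdet(np.zeros((600, 800, 3))))
```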
Context enhancement is critical for night vision (NV) applications, especially in dark-night situations without any artificial light. In this paper, we present the infrared-to-visible (IR2VI) algorithm, a novel unsupervised thermal-to-visible image translation framework based on generative adversarial networks (GANs). IR2VI is able to learn the intrinsic characteristics of visible (VI) images and integrate them into infrared (IR) images. Since existing unsupervised GAN-based image translation approaches face several challenges, such as incorrect mapping and a lack of fine details, we propose a structure connection module and a region-of-interest (ROI) focal loss to address these limitations. Experimental results show the superiority of the IR2VI algorithm over baseline methods.
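The sketch below conveys one plausible reading of an ROI-restricted focal-style loss: a focal term that up-weights hard pixels, evaluated only inside an ROI mask so the translator concentrates on fine detail in salient regions. The gamma value, masking scheme, and signatures are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of an ROI focal-style loss (illustrative, not IR2VI's code).
import torch
import torch.nn.functional as F

def roi_focal_loss(pred, target, roi_mask, gamma=2.0):
    """pred, target: (N, 1, H, W) in [0, 1]; roi_mask: (N, 1, H, W) binary."""
    bce = F.binary_cross_entropy(pred, target, reduction='none')
    p_t = target * pred + (1 - target) * (1 - pred)   # prob of the true class
    focal = (1 - p_t) ** gamma * bce                  # down-weight easy pixels
    return (focal * roi_mask).sum() / roi_mask.sum().clamp(min=1)

pred = torch.rand(2, 1, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
roi = (torch.rand(2, 1, 64, 64) > 0.7).float()        # salient-object mask
print(roi_focal_loss(pred, target, roi).item())
```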
Bird strikes present a huge risk to aircraft, especially since traditional airport bird surveillance depends mainly on inefficient human observation. Computer-vision-based technology has been proposed to automatically detect birds, determine bird flight trajectories, and predict aircraft takeoff delays. However, the imaging characteristics of bird flight and the performance of existing methods applied to the flying-bird tracking task are not well understood. Therefore, we perform infrared flying-bird tracking experiments using 12 state-of-the-art algorithms on the real BIRDSITE-IR dataset to obtain useful clues and recommend feature analyses. We also develop a Struck-scale method to demonstrate the effectiveness of multi-scale sampling adaptation in handling flying birds with varying shape and scale. The general analysis can be used to develop specialized bird tracking methods for airport safety and for wilderness and urban bird population studies.
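To illustrate the multi-scale sampling adaptation idea, the sketch below draws candidate windows around the previous bird bounding box at several translations and scales and keeps the highest-scoring one. The offsets, scales, and the stand-in score function (in place of a learned scorer such as Struck's structured-output SVM) are assumptions for illustration.

```python
# Illustrative multi-scale candidate sampling for one tracking step.
import itertools

def multi_scale_candidates(box, offsets=(-4, 0, 4), scales=(0.9, 1.0, 1.1)):
    x, y, w, h = box
    for dx, dy, s in itertools.product(offsets, offsets, scales):
        yield (x + dx, y + dy, w * s, h * s)

def track_step(prev_box, score_fn):
    """Pick the candidate window with the highest tracker score."""
    return max(multi_scale_candidates(prev_box), key=score_fn)

# Toy score: prefer candidates near a 'true' bird at (105, 52) sized 22x11.
true_box = (105, 52, 22, 11)
score = lambda b: -sum(abs(a - t) for a, t in zip(b, true_box))
print(track_step((100, 50, 20, 10), score))
```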