Although deep learning has demonstrated astonishing performance in many applications, there are still concerns on their dependability. One desirable property of deep learning applications with societal impact is fairness (i.e., non-discrimination). Unfortunately, discrimination might be intrinsically embedded into the models due to discrimination in the training data. As a countermeasure, fairness testing systemically identifies discriminative samples, which can be used to retrain the model and improve its fairness. Existing fairness testing approaches however have two major limitations. First, they only work well on traditional machine learning models and have poor performance (e.g., effectiveness and efficiency) on deep learning models. Second, they only work on simple tabular data and are not applicable for domains such as text. In this work, we bridge the gap by proposing a scalable and effective approach for systematically searching for discriminative samples while extending fairness testing to address a challenging domain, i.e., text classification. Compared with state-of-the-art methods, our approach only employs lightweight procedures like gradient computation and clustering, which makes it significantly more scalable. Experimental results show that on average, our approach explores the search space more effectively (9.62 and 2.38 times more than the state-of-art methods respectively on tabular and text datasets) and generates much more individual discriminatory instances (24.95 and 2.68 times) within reasonable time. The retrained models reduce discrimination by 57.2% and 60.2% respectively on average.
Cyber-physical systems (CPSs) are widespread in critical domains, and significant damage can be caused if an attacker is able to modify the code of their programmable logic controllers (PLCs). Unfortunately, traditional techniques for attesting code integrity (i.e. verifying that it has not been modified) rely on firmware access or roots-of-trust, neither of which proprietary or legacy PLCs are likely to provide. In this paper, we propose a practical code integrity checking solution based on privacy-preserving black box models that instead attest the input/output behaviour of PLC programs. Using faithful offline copies of the PLC programs, we identify their most important inputs through an information flow analysis, execute them on multiple combinations to collect data, then train neural networks able to predict PLC outputs (i.e. actuator commands) from their inputs. By exploiting the black box nature of the model, our solution maintains the privacy of the original PLC code and does not assume that attackers are unaware of its presence. The trust instead comes from the fact that it is extremely hard to attack the PLC code and neural networks at the same time and with consistent outcomes. We evaluated our approach on a modern six-stage water treatment plant testbed, finding that it could predict actuator states from PLC inputs with near-100% accuracy, and thus could detect all 120 effective code mutations that we subjected the PLCs to. Finally, we found that it is not practically possible to simultaneously modify the PLC code and apply discreet adversarial noise to our attesters in a way that leads to consistent (mis-)predictions.
The threats faced by cyber-physical systems (CPSs) in critical infrastructure have motivated research into a multitude of attack detection mechanisms, including anomaly detectors based on neural network models. The effectiveness of anomaly detectors can be assessed by subjecting them to test suites of attacks, but less consideration has been given to adversarial attackers that craft noise specifically designed to deceive them. While successfully applied in domains such as images and audio, adversarial attacks are much harder to implement in CPSs due to the presence of other built-in defence mechanisms such as rule checkers(or invariant checkers). In this work, we present an adversarial attack that simultaneously evades the anomaly detectors and rule checkers of a CPS. Inspired by existing gradient-based approaches, our adversarial attack crafts noise over the sensor and actuator values, then uses a genetic algorithm to optimise the latter, ensuring that the neural network and the rule checking system are both deceived.We implemented our approach for two real-world critical infrastructure testbeds, successfully reducing the classification accuracy of their detectors by over 50% on average, while simultaneously avoiding detection by rule checkers. Finally, we explore whether these attacks can be mitigated by training the detectors on adversarial samples.
Accurate pedestrian classification and localization have received considerable attention due to their wide applications such as security monitoring, autonomous driving, etc. Although pedestrian detectors have made great progress in recent years, the fixed Intersection over Union (IoU) based assignment-regression manner still limits their performance. Two main factors are responsible for this: 1) the IoU threshold faces a dilemma that a lower one will result in more false positives, while a higher one will filter out the matched positives; 2) the IoU-based GT-Proposal assignment suffers from the inconsistent supervision problem that spatially adjacent proposals with similar features are assigned to different ground-truth boxes, which means some very similar proposals may be forced to regress towards different targets, and thus confuses the bounding-box regression when predicting the location results. In this paper, we first put forward the question that \textbf{Regression Direction} would affect the performance for pedestrian detection. Consequently, we address the weakness of IoU by introducing one geometric sensitive search algorithm as a new assignment and regression metric. Different from the previous IoU-based \textbf{one-to-one} assignment manner of one proposal to one ground-truth box, the proposed method attempts to seek a reasonable matching between the sets of proposals and ground-truth boxes. Specifically, we boost the MR-FPPI under R$_{75}$ by 8.8\% on Citypersons dataset. Furthermore, by incorporating this method as a metric into the state-of-the-art pedestrian detectors, we show a consistent improvement.
As a new programming paradigm, deep learning has expanded its application to many real-world problems. At the same time, deep learning based software are found to be vulnerable to adversarial attacks. Though various defense mechanisms have been proposed to improve robustness of deep learning software, many of them are ineffective against adaptive attacks. In this work, we propose a novel characterization to distinguish adversarial examples from benign ones based on the observation that adversarial examples are significantly less robust than benign ones. As existing robustness measurement does not scale to large networks, we propose a novel defense framework, named attack as defense (A2D), to detect adversarial examples by effectively evaluating an example's robustness. A2D uses the cost of attacking an input for robustness evaluation and identifies those less robust examples as adversarial since less robust examples are easier to attack. Extensive experiment results on MNIST, CIFAR10 and ImageNet show that A2D is more effective than recent promising approaches. We also evaluate our defence against potential adaptive attacks and show that A2D is effective in defending carefully designed adaptive attacks, e.g., the attack success rate drops to 0% on CIFAR10.
Distortion quantification of point clouds plays a stealth, yet vital role in a wide range of human and machine perception tasks. For human perception tasks, a distortion quantification can substitute subjective experiments to guide 3D visualization; while for machine perception tasks, a distortion quantification can work as a loss function to guide the training of deep neural networks for unsupervised learning tasks. To handle a variety of demands in many applications, a distortion quantification needs to be distortion discriminable, differentiable, and have a low computational complexity. Currently, however, there is a lack of a general distortion quantification that can satisfy all three conditions. To fill this gap, this work proposes multiscale potential energy discrepancy (MPED), a distortion quantification to measure point cloud geometry and color difference. By evaluating at various neighborhood sizes, the proposed MPED achieves global-local tradeoffs, capturing distortion in a multiscale fashion. Extensive experimental studies validate MPED's superiority for both human and machine perception tasks.
The prevalent perspectives of scene text recognition are from sequence to sequence (seq2seq) and segmentation. In this paper, we propose a new perspective on scene text recognition, in which we model the scene text recognition as an image classification problem. Based on the image classification perspective, a scene text recognition model is proposed, which is named as CSTR. The CSTR model consists of a series of convolutional layers and a global average pooling layer at the end, followed by independent multi-class classification heads, each of which predicts the corresponding character of the word sequence in input image. The CSTR model is easy to train using parallel cross entropy losses. CSTR is as simple as image classification models like ResNet \cite{he2016deep} which makes it easy to implement, and the fully convolutional neural network architecture makes it efficient to train and deploy. We demonstrate the effectiveness of the classification perspective on scene text recognition with thorough experiments. Futhermore, CSTR achieves nearly state-of-the-art performance on six public benchmarks including regular text, irregular text. The code will be available at https://github.com/Media-Smart/vedastr.
Pedestrian detection benefits greatly from deep convolutional neural networks (CNNs). However, it is inherently hard for CNNs to handle situations in the presence of occlusion and scale variation. In this paper, we propose W$^3$Net, which attempts to address above challenges by decomposing the pedestrian detection task into \textbf{\textit{W}}here, \textbf{\textit{W}}hat and \textbf{\textit{W}}hether problem directing against pedestrian localization, scale prediction and classification correspondingly. Specifically, for a pedestrian instance, we formulate its feature by three steps. i) We generate a bird view map, which is naturally free from occlusion issues, and scan all points on it to look for suitable locations for each pedestrian instance. ii) Instead of utilizing pre-fixed anchors, we model the interdependency between depth and scale aiming at generating depth-guided scales at different locations for better matching instances of different sizes. iii) We learn a latent vector shared by both visual and corpus space, by which false positives with similar vertical structure but lacking human partial features would be filtered out. We achieve state-of-the-art results on widely used datasets (Citypersons and Caltech). In particular. when evaluating on heavy occlusion subset, our results reduce MR$^{-2}$ from 49.3$\%$ to 18.7$\%$ on Citypersons, and from 45.18$\%$ to 28.33$\%$ on Caltech.
Neural networks are increasingly applied to support decision making in safety-critical applications (like autonomous cars, unmanned aerial vehicles and face recognition based authentication). While many impressive static verification techniques have been proposed to tackle the correctness problem of neural networks, it is possible that static verification may never be sufficiently scalable to handle real-world neural networks. In this work, we propose a runtime verification method to ensure the correctness of neural networks. Given a neural network and a desirable safety property, we adopt state-of-the-art static verification techniques to identify strategically locations to introduce additional gates which "correct" neural network behaviors at runtime. Experiment results show that our approach effectively generates neural networks which are guaranteed to satisfy the properties, whilst being consistent with the original neural network most of the time.
We propose a spurious region guided refinement approach for robustness verification of deep neural networks. Our method starts with applying the DeepPoly abstract domain to analyze the network. If the robustness property cannot be verified, the result is inconclusive. Due to the over-approximation, the computed region in the abstraction may be spurious in the sense that it does not contain any true counterexample. Our goal is to identify such spurious regions and use them to guide the abstraction refinement. The core idea is to make use of the obtained constraints of the abstraction to infer new bounds for the neurons. This is achieved by linear programming techniques. With the new bounds, we iteratively apply DeepPoly, aiming to eliminate spurious regions. We have implemented our approach in a prototypical tool DeepSRGR. Experimental results show that a large amount of regions can be identified as spurious, and as a result, the precision of DeepPoly can be significantly improved. As a side contribution, we show that our approach can be applied to verify quantitative robustness properties.