We analyze the data-dependent capacity of neural networks and assess anomalies in inputs from the perspective of networks during inference. The notion of data-dependent capacity allows for analyzing the knowledge base of a model populated by learned features from training data. We define purview as the additional capacity necessary to characterize inference samples that differ from the training data. To probe the purview of a network, we utilize gradients to measure the amount of change required for the model to characterize the given inputs more accurately. To eliminate the dependency on ground-truth labels in generating gradients, we introduce confounding labels that are formulated by combining multiple categorical labels. We demonstrate that our gradient-based approach can effectively differentiate inputs that cannot be accurately represented with learned features. We utilize our approach in applications of detecting anomalous inputs, including out-of-distribution, adversarial, and corrupted samples. Our approach requires no hyperparameter tuning or additional data processing and outperforms state-of-the-art methods by up to 2.7%, 19.8%, and 35.6% in AUROC score, respectively.
We propose to utilize gradients for detecting adversarial and out-of-distribution samples. We introduce confounding labels -- labels that differ from normal labels seen during training -- in gradient generation to probe the effective expressivity of neural networks. Gradients depict the amount of change required for a model to properly represent given inputs, providing insight into the representational power of the model established by network architectural properties as well as training data. By introducing a label of different design, we remove the dependency on ground-truth labels for gradient generation during inference. We show that our gradient-based approach allows for capturing anomalies in inputs based on the effective expressivity of the models with no hyperparameter tuning or additional processing, and outperforms state-of-the-art methods for adversarial and out-of-distribution detection.
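The confounding-label gradient score described in the two abstracts above can be sketched for a toy linear softmax classifier. This is a minimal illustration, not the papers' implementation: the function name, the multi-hot label design, and the use of the gradient L2 norm as the anomaly score are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def confounding_gradient_score(W, x, confounding_label):
    """Anomaly score for input x under a linear softmax model with
    weights W (dim x num_classes): backpropagate a cross-entropy loss
    against a confounding (multi-hot) label and return the L2 norm of
    the resulting weight gradient. Larger norm = more change needed to
    represent the input, i.e. more anomalous under this sketch."""
    p = softmax(x @ W)                               # predicted class probabilities
    y = confounding_label / confounding_label.sum()  # normalize the multi-hot label
    # For a linear softmax model, dL/dW = outer(x, p - y) for cross-entropy loss.
    grad = np.outer(x, p - y)
    return float(np.linalg.norm(grad))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))                   # stand-in for trained weights
x = rng.normal(size=8)                        # stand-in for an inference input
conf_label = np.array([1.0, 1.0, 0.0, 0.0])   # confounding label combining classes 0 and 1
score = confounding_gradient_score(W, x, conf_label)
```

Because the confounding label is fixed at inference time, no ground-truth label is needed to generate the gradient, which is the property the abstracts emphasize.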
Neural networks for image classification tasks assume that any given image during inference belongs to one of the training classes. This closed-set assumption is challenged in real-world applications where models may encounter inputs of unknown classes. Open-set recognition aims to solve this problem by rejecting unknown classes while classifying known classes correctly. In this paper, we propose to utilize gradient-based representations obtained from a known classifier to train an unknown detector with instances of known classes only. Gradients correspond to the amount of model updates required to properly represent a given sample, which we exploit to understand the model's capability to characterize inputs with its learned features. Our approach can be utilized with any classifier trained in a supervised manner on known classes without the need to model the distribution of unknown samples explicitly. We show that our gradient-based approach outperforms state-of-the-art methods by up to 11.6% in open-set classification.
Despite the tremendous success of modern neural networks, they are known to be overconfident even when the model encounters inputs with unfamiliar conditions. Detecting such inputs is vital to preventing models from making naive predictions that may jeopardize real-world applications of neural networks. In this paper, we address the challenging problem of devising a simple yet effective measure of uncertainty in deep neural networks. Specifically, we propose to utilize backpropagated gradients to quantify the uncertainty of trained models. Gradients depict the required amount of change for a model to properly represent given inputs, thus providing valuable insight into how familiar and certain the model is regarding the inputs. We demonstrate the effectiveness of gradients as a measure of model uncertainty in applications of detecting unfamiliar inputs, including out-of-distribution and corrupted samples. We show that our gradient-based method outperforms state-of-the-art methods by up to 4.8% in AUROC score in out-of-distribution detection and 35.7% in corrupted input detection.
In this paper, we investigate the reliability of online recognition platforms, Amazon Rekognition and Microsoft Azure, with respect to changes in background, acquisition device, and object orientation. We focus on platforms that are commonly used by the public to better understand their real-world performance. To assess the variation in recognition performance, we perform a controlled experiment by changing the acquisition conditions one at a time. We use three smartphones, one DSLR, and one webcam to capture side views and overhead views of objects in living room, office, and photo studio setups. Moreover, we introduce a framework to estimate recognition performance with respect to backgrounds and orientations. In this framework, we utilize both handcrafted features based on color, texture, and shape characteristics and data-driven features obtained from deep neural networks. Experimental results show that deep learning-based image representations can estimate the recognition performance variation with a Spearman's rank-order correlation of 0.94 under multifarious acquisition conditions.
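The Spearman's rank-order correlation used above to evaluate how well image representations estimate recognition performance can be computed from ranks alone. A minimal sketch follows, assuming no tied values (ties would require average, i.e. fractional, ranks); the example numbers are hypothetical, not results from the paper.

```python
import numpy as np

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank vectors.
    Assumes all values are distinct; ties would need average ranks."""
    # Double argsort converts values to 0-based ranks when there are no ties.
    ra = np.argsort(np.argsort(np.asarray(a))).astype(float)
    rb = np.argsort(np.argsort(np.asarray(b))).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# Hypothetical estimated vs. measured recognition accuracy across conditions.
estimated = [0.91, 0.74, 0.55, 0.32, 0.18]
measured = [0.89, 0.70, 0.60, 0.35, 0.15]
rho = spearman(estimated, measured)
```

A rank correlation is the natural metric here because the framework only needs to order acquisition conditions by how much they degrade recognition, not to predict the exact accuracy values.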
In this paper, we introduce a large-scale, controlled, and multi-platform object recognition dataset denoted as Challenging Unreal and Real Environments for Object Recognition (CURE-OR). In this dataset, there are 1,000,000 images of 100 objects with varying size, color, and texture that are positioned in five different orientations and captured using five devices including a webcam, a DSLR, and three smartphone cameras in real-world (real) and studio (unreal) environments. The controlled challenging conditions include underexposure, overexposure, blur, contrast, dirty lens, image noise, resizing, and loss of color information. We utilize the CURE-OR dataset to test recognition APIs, Amazon Rekognition and Microsoft Azure Computer Vision, and show that their performance significantly degrades under challenging conditions. Moreover, we investigate the relationship between object recognition and image quality and show that objective quality algorithms can estimate recognition performance under certain photometric challenging conditions. The dataset is publicly available at https://ghassanalregib.com/cure-or/.