Soheil Feizi

Core Risk Minimization using Salient ImageNet

Mar 28, 2022

Sahil Singla, Mazda Moayeri, Soheil Feizi

Abstract: Deep neural networks can be unreliable in the real world, especially when they rely heavily on spurious features for their predictions. Recently, Singla & Feizi (2022) introduced the Salient ImageNet dataset by annotating and localizing core and spurious features of ~52k samples from 232 classes of ImageNet. While this dataset is useful for evaluating the reliance of pretrained models on spurious features, its small size limits its usefulness for training models. In this work, we first introduce the Salient ImageNet-1M dataset with more than 1 million soft masks localizing core and spurious features for all 1000 ImageNet classes. Using this dataset, we evaluate the reliance of several ImageNet pretrained models (42 total) on spurious features and observe that: (i) transformers are more sensitive to spurious features than ConvNets, and (ii) zero-shot CLIP transformers are highly susceptible to spurious features. Next, we introduce a new learning paradigm called Core Risk Minimization (CoRM) whose objective ensures that the model predicts a class using its core features. We evaluate different computational approaches for solving CoRM and achieve significantly higher (+12%) core accuracy (accuracy when non-core regions are corrupted using noise) with no drop in clean accuracy compared to models trained via Empirical Risk Minimization.
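The core accuracy metric described in this abstract can be illustrated with a minimal sketch. It assumes that "corrupted using noise" means additive Gaussian noise weighted by one minus the soft core mask; the names `corrupt_non_core`, `core_accuracy`, and the `sigma` parameter are hypothetical and not taken from the paper.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # flattened pixel values (illustrative representation)


def corrupt_non_core(image: Sequence[float], core_mask: Sequence[float],
                     sigma: float, rng: random.Random) -> Image:
    """Add Gaussian noise scaled by (1 - mask): core pixels (mask = 1) stay untouched.

    Assumption: additive Gaussian noise; the abstract only says "noise".
    """
    if len(image) != len(core_mask):
        raise ValueError("image and mask must have the same length")
    return [px + (1.0 - m) * rng.gauss(0.0, sigma)
            for px, m in zip(image, core_mask)]


def core_accuracy(predict: Callable[[Image], int],
                  samples: Sequence[Tuple[Image, Sequence[float], int]],
                  sigma: float = 0.25, seed: int = 0) -> float:
    """Fraction of samples still classified correctly after non-core corruption."""
    rng = random.Random(seed)
    correct = 0
    for image, mask, label in samples:
        noisy = corrupt_non_core(image, mask, sigma, rng)
        correct += int(predict(noisy) == label)
    return correct / len(samples)


# Toy usage: a classifier that reads only pixel 0, which the mask marks as core.
toy_samples = [([1.0, 0.3, 0.3], [1.0, 0.0, 0.0], 1),
               ([0.0, 0.3, 0.3], [1.0, 0.0, 0.0], 0)]
core_only = lambda img: int(img[0] > 0.5)
print(core_accuracy(core_only, toy_samples))
```

A model that depends only on core pixels keeps its accuracy under this corruption, whereas a model that leans on the masked-out regions would be expected to lose accuracy as `sigma` grows.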

Provable Adversarial Robustness for Fractional Lp Threat Models

Mar 16, 2022

Alexander Levine, Soheil Feizi

Abstract: In recent years, researchers have extensively studied adversarial robustness in a variety of threat models, including L_0, L_1, L_2, and L_infinity-norm bounded adversarial attacks. However, attacks bounded by fractional L_p "norms" (quasi-norms defined by the L_p distance with 0<p<1) have yet to be thoroughly considered. We proactively propose a defense with several desirable properties: it provides provable (certified) robustness, scales to ImageNet, and yields deterministic (rather than high-probability) certified guarantees when applied to quantized data (e.g., images). Our technique for fractional L_p robustness constructs expressive, deep classifiers that are globally Lipschitz with respect to the L_p^p metric, for any 0<p<1. However, our method is even more general: we can construct classifiers which are globally Lipschitz with respect to any metric defined as the sum of concave functions of components. Our approach builds on a recent work, Levine and Feizi (2021), which provides a provable defense against L_1 attacks. However, we demonstrate that our proposed guarantees are highly non-vacuous, compared to the trivial solution of using (Levine and Feizi, 2021) directly and applying norm inequalities. Code is available at https://github.com/alevine0/fractionalLpRobustness.
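To make the distinction between the fractional L_p quasi-norm and the L_p^p metric concrete, here is a small sketch. The function names are illustrative and are not taken from the linked repository. It computes the L_p^p distance as a sum of the concave function |t|^p over components, and checks on one example that the quasi-norm violates the triangle inequality while the L_p^p distance does not.

```python
from typing import Sequence


def lpp_distance(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """L_p^p distance: sum_i |x_i - y_i|^p, a metric for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y))


def lp_quasinorm_distance(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Fractional L_p "norm" of x - y: the L_p^p distance raised to the power 1/p."""
    return lpp_distance(x, y, p) ** (1.0 / p)


origin, a, b = [0.0, 0.0], [1.0, 0.0], [1.0, 1.0]
p = 0.5

# Quasi-norm: going origin -> a -> b costs 1 + 1 = 2, but the direct distance is 4.
print(lp_quasinorm_distance(origin, b, p),
      lp_quasinorm_distance(origin, a, p) + lp_quasinorm_distance(a, b, p))

# L_p^p metric: the direct distance (2) does not exceed the two-step path (1 + 1).
print(lpp_distance(origin, b, p),
      lpp_distance(origin, a, p) + lpp_distance(a, b, p))
```

Because the L_p^p distance obeys the triangle inequality, it is a genuine metric, which is what makes a global Lipschitz condition with respect to it well defined.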

* AISTATS 2022 accepted paper

Understanding Failure Modes of Self-Supervised Learning

Mar 03, 2022

Neha Mukund Kalibhat, Kanika Narang, Liang Tan, Hamed Firooz, Maziar Sanjabi, Soheil Feizi

Abstract: Self-supervised learning methods have shown impressive results in downstream classification tasks. However, there is limited work in understanding their failure modes and interpreting the learned representations of these models. In this paper, we tackle these issues and study the representation space of self-supervised models by understanding the underlying reasons for misclassifications in a downstream task. Over several state-of-the-art self-supervised models including SimCLR, SwAV, MoCo V2 and BYOL, we observe that representations of correctly classified samples have few discriminative features with highly deviated values compared to other features. This is in clear contrast with representations of misclassified samples. We also observe that noisy features in the representation space often correspond to spurious attributes in images, making the models less interpretable. Building on these observations, we propose a sample-wise Self-Supervised Representation Quality Score (or, Q-Score) that, without access to any label information, is able to predict if a given sample is likely to be misclassified in the downstream task, achieving an AUPRC of up to 0.90. Q-Score can also be used as a regularizer to remedy low-quality representations, leading to a 3.26% relative improvement in accuracy of SimCLR on ImageNet-100. Moreover, we show that Q-Score regularization increases representation sparsity, thus reducing noise and improving interpretability through gradient heatmaps.

Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation

Feb 05, 2022

Wenxiao Wang, Alexander Levine, Soheil Feizi

Abstract: Data poisoning attacks aim at manipulating model behaviors through distorting training data. Previously, an aggregation-based certified defense, Deep Partition Aggregation (DPA), was proposed to mitigate this threat. DPA predicts through an aggregation of base classifiers trained on disjoint subsets of data, thus restricting its sensitivity to dataset distortions. In this work, we propose an improved certified defense against general poisoning attacks, namely Finite Aggregation. In contrast to DPA, which directly splits the training set into disjoint subsets, our method first splits the training set into smaller disjoint subsets and then combines duplicates of them to build larger (but not disjoint) subsets for training base classifiers. This reduces the worst-case impacts of poison samples and thus improves certified robustness bounds. In addition, we offer an alternative view of our method, bridging the designs of deterministic and stochastic aggregation-based certified defenses. Empirically, our proposed Finite Aggregation consistently improves certificates on MNIST, CIFAR-10, and GTSRB, boosting certified fractions by up to 3.05%, 3.87% and 4.77%, respectively, while keeping the same clean accuracies as DPA's, effectively establishing a new state of the art in (pointwise) certified robustness against data poisoning.
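The split-then-recombine construction described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the parameters `k` and `d`, the use of `k * d` small partitions and as many base classifiers, and the cyclic rule for choosing which partitions feed each classifier are all hypothetical choices, since the abstract does not specify them.

```python
import hashlib
from typing import List, Sequence, Tuple


def partition_index(sample_id: str, num_partitions: int) -> int:
    """Deterministically map a sample to one small partition via a hash."""
    digest = hashlib.sha256(sample_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def build_training_subsets(sample_ids: Sequence[str], k: int, d: int
                           ) -> Tuple[List[List[str]], List[List[str]]]:
    """Split into k*d disjoint partitions, then give each base classifier d of them.

    Assumption: classifier j takes partitions j, j+1, ..., j+d-1 (mod k*d),
    so every partition is duplicated into exactly d overlapping subsets.
    """
    n_parts = k * d
    partitions: List[List[str]] = [[] for _ in range(n_parts)]
    for sid in sample_ids:
        partitions[partition_index(sid, n_parts)].append(sid)

    subsets: List[List[str]] = []
    for j in range(n_parts):
        member_parts = [(j + t) % n_parts for t in range(d)]
        subsets.append([sid for part in member_parts for sid in partitions[part]])
    return partitions, subsets


ids = [f"img{i}" for i in range(50)]
partitions, subsets = build_training_subsets(ids, k=4, d=3)
# Each sample lands in exactly d = 3 of the 12 overlapping training subsets.
print(len(subsets), sum("img7" in s for s in subsets))
```

Under this construction a single poisoned sample can influence at most `d` base classifiers, each trained on a subset roughly as large as one of `k` disjoint splits would be, which is the intuition behind the tighter worst-case bound.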

Certifying Model Accuracy under Distribution Shifts

Jan 28, 2022

Aounon Kumar, Alexander Levine, Tom Goldstein, Soheil Feizi

Abstract: Certified robustness in machine learning has primarily focused on adversarial perturbations of the input with a fixed attack budget for each point in the data distribution. In this work, we present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution. We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation. Our framework allows the datum-specific perturbation size to vary across different points in the input distribution and is general enough to include fixed-sized perturbations as well. Our certificates produce guaranteed lower bounds on the performance of the model for any (natural or adversarial) shift of the input distribution within a Wasserstein ball around the original distribution. We apply our technique to: (i) certify robustness against natural (non-adversarial) transformations of images such as color shifts, hue shifts and changes in brightness and saturation, (ii) certify robustness against adversarial shifts of the input distribution, and (iii) show provable lower bounds (hardness results) on the performance of models trained on so-called "unlearnable" datasets that have been poisoned to interfere with model training.

A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes

Jan 26, 2022

Mazda Moayeri, Phillip Pope, Yogesh Balaji, Soheil Feizi

Abstract: While datasets with single-label supervision have propelled rapid advances in image classification, additional annotations are necessary in order to quantitatively assess how models make predictions. To this end, for a subset of ImageNet samples, we collect segmentation masks for the entire object and 18 informative attributes. We call this dataset RIVAL10 (RIch Visual Attributes with Localization), consisting of roughly 26k instances over 10 classes. Using RIVAL10, we evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes. In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training). We find that, somewhat surprisingly, in ResNets, adversarial training makes models more sensitive to the background, relative to the foreground, than standard training does. Similarly, contrastively-trained models also have lower relative foreground sensitivity in both transformers and ResNets. Lastly, we observe intriguing adaptive abilities of transformers to increase relative foreground sensitivity as corruption level increases. Using saliency methods, we automatically discover spurious features that drive the background sensitivity of models and assess alignment of saliency maps with foregrounds. Finally, we quantitatively study the attribution problem for neural features by comparing feature saliency with ground-truth localization of semantic attributes.

Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses

Dec 12, 2021

Chun Pong Lau, Jiang Liu, Hossein Souri, Wei-An Lin, Soheil Feizi, Rama Chellappa

Abstract: Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks. However, models trained with AT sacrifice standard accuracy and do not generalize well to novel attacks. Recent works show generalization improvement with adversarial samples under novel threat models such as the on-manifold threat model or the neural perceptual threat model. However, the former requires exact manifold information while the latter requires algorithm relaxation. Motivated by these considerations, we exploit the underlying manifold information with Normalizing Flow, ensuring that the exact manifold assumption holds. Moreover, we propose a novel threat model called Joint Space Threat Model (JSTM), which can serve as a special case of the neural perceptual threat model that does not require additional relaxation to craft the corresponding adversarial attacks. Under JSTM, we develop novel adversarial attacks and defenses. The mixup strategy improves the standard accuracy of neural networks but sacrifices robustness when combined with AT. To tackle this issue, we propose the Robust Mixup strategy, in which we maximize the adversity of the interpolated images to gain robustness and prevent overfitting. Our experiments show that Interpolated Joint Space Adversarial Training (IJSAT) achieves good performance in standard accuracy, robustness, and generalization on the CIFAR-10/100, OM-ImageNet, and CIFAR-10-C datasets. IJSAT is also flexible and can be used as a data augmentation method to improve standard accuracy and combined with many existing AT approaches to improve robustness.

* Under submission

Mutual Adversarial Training: Learning together is better than going alone

Dec 09, 2021

Jiang Liu, Chun Pong Lau, Hossein Souri, Soheil Feizi, Rama Chellappa

Abstract: Recent studies have shown that robustness to adversarial attacks can be transferred across networks. In other words, we can make a weak model more robust with the help of a strong teacher model. We ask whether, instead of learning from a static teacher, models can "learn together" and "teach each other" to achieve better robustness. In this paper, we study how interactions among models affect robustness via knowledge distillation. We propose mutual adversarial training (MAT), in which multiple models are trained together and share the knowledge of adversarial examples to achieve improved robustness. MAT allows robust models to explore a larger space of adversarial samples, and find more robust feature spaces and decision boundaries. Through extensive experiments on CIFAR-10 and CIFAR-100, we demonstrate that MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks, bringing ~8% accuracy gain to vanilla adversarial training (AT) under PGD-100 attacks. In addition, we show that MAT can also mitigate the robustness trade-off among different perturbation types, bringing as much as 13.1% accuracy gain to AT baselines against the union of l_infinity, l_2 and l_1 attacks. These results show the superiority of the proposed method and demonstrate that collaborative learning is an effective strategy for designing robust models.

* Under submission

Segment and Complete: Defending Object Detectors against Adversarial Patch Attacks with Robust Patch Detection

Dec 08, 2021

Jiang Liu, Alexander Levine, Chun Pong Lau, Rama Chellappa, Soheil Feizi

Abstract: Object detection plays a key role in many security-critical systems. Adversarial patch attacks, which are easy to implement in the physical world, pose a serious threat to state-of-the-art object detectors. Developing reliable defenses for object detectors against patch attacks is critical but severely understudied. In this paper, we propose Segment and Complete defense (SAC), a general framework for defending object detectors against patch attacks through detecting and removing adversarial patches. We first train a patch segmenter that outputs patch masks that provide pixel-level localization of adversarial patches. We then propose a self adversarial training algorithm to robustify the patch segmenter. In addition, we design a robust shape completion algorithm, which is guaranteed to remove the entire patch from the images provided that the outputs of the patch segmenter are within a certain Hamming distance of the ground-truth patch masks. Our experiments on COCO and xView datasets demonstrate that SAC achieves superior robustness even under strong adaptive attacks with no performance drop on clean images, and generalizes well to unseen patch shapes, attack budgets, and unseen attack methods. Furthermore, we present the APRICOT-Mask dataset, which augments the APRICOT dataset with pixel-level annotations of adversarial patches. We show SAC can significantly reduce the targeted attack success rate of physical patch attacks.

* Under submission

Improving Deep Learning Interpretability by Saliency Guided Training

Nov 29, 2021

Aya Abdelsalam Ismail, Héctor Corrada Bravo, Soheil Feizi

Abstract: Saliency methods have been widely used to highlight important input features in model predictions. Most existing methods use backpropagation on a modified gradient function to generate saliency maps. Thus, noisy gradients can result in unfaithful feature attributions. In this paper, we tackle this issue and introduce a saliency guided training procedure for neural networks to reduce noisy gradients used in predictions while retaining the predictive performance of the model. Our saliency guided training procedure iteratively masks features with small and potentially noisy gradients while maximizing the similarity of model outputs for both masked and unmasked inputs. We apply the saliency guided training procedure to various synthetic and real data sets from computer vision, natural language processing, and time series across diverse neural architectures, including Recurrent Neural Networks, Convolutional Networks, and Transformers. Through qualitative and quantitative evaluations, we show that the saliency guided training procedure significantly improves model interpretability across various domains while preserving its predictive performance.
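The two ingredients named in this abstract, masking low-gradient features and encouraging similar outputs for masked and unmasked inputs, can be sketched in a few lines. This is a rough illustration under assumptions the abstract does not confirm: masked features are replaced by a constant `fill_value`, similarity is measured with a KL divergence between softmax outputs, and the weight `lam` and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def mask_low_saliency(features: Sequence[float], gradients: Sequence[float],
                      k: int, fill_value: float = 0.0) -> List[float]:
    """Replace the k features with the smallest |gradient| by fill_value."""
    order = sorted(range(len(features)), key=lambda i: abs(gradients[i]))
    low = set(order[:k])
    return [fill_value if i in low else v for i, v in enumerate(features)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def saliency_guided_loss(task_loss: float, logits_clean: Sequence[float],
                         logits_masked: Sequence[float], lam: float = 1.0) -> float:
    """Task loss plus a penalty when masked and unmasked outputs disagree."""
    return task_loss + lam * kl_divergence(softmax(logits_clean),
                                           softmax(logits_masked))


x = [1.0, 2.0, 3.0, 4.0]
grad = [0.5, -0.01, 0.3, 0.02]       # hypothetical input gradients
print(mask_low_saliency(x, grad, k=2))
print(saliency_guided_loss(0.7, [2.0, 0.5], [1.8, 0.6]))
```

In a real training loop the gradients would come from backpropagation through the model and the masked input would be fed through the same network; the sketch only shows how the mask and the similarity penalty fit together.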

* Thirty-fifth Conference on Neural Information Processing Systems 2021
