
David Stutz


Certified Robust Models with Slack Control and Large Lipschitz Constants

Sep 12, 2023
Max Losch, David Stutz, Bernt Schiele, Mario Fritz

Despite recent success, state-of-the-art learning-based models remain highly vulnerable to input changes such as adversarial examples. In order to obtain certifiable robustness against such perturbations, recent work considers Lipschitz-based regularizers or constraints while at the same time increasing prediction margin. Unfortunately, this comes at the cost of significantly decreased accuracy. In this paper, we propose a Calibrated Lipschitz-Margin Loss (CLL) that addresses this issue and improves certified robustness by tackling two problems: First, commonly used margin losses do not adjust their penalties to the shrinking output distribution caused by minimizing the Lipschitz constant $K$. Second, and most importantly, we observe that minimization of $K$ can lead to overly smooth decision functions. This limits the model's complexity and thus reduces accuracy. Our CLL addresses these issues by explicitly calibrating the loss w.r.t. margin and Lipschitz constant, thereby establishing full control over slack and improving robustness certificates even with larger Lipschitz constants. On CIFAR-10, CIFAR-100 and Tiny-ImageNet, models trained with CLL consistently outperform those trained with losses that leave the constant unattended. On CIFAR-100 and Tiny-ImageNet, CLL improves upon state-of-the-art deterministic $L_2$ robust accuracies. In contrast to current trends, we unlock the potential of much smaller models without $K=1$ constraints.

* To be published at GCPR 2023 
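
To make the calibration idea concrete, below is a minimal sketch of a margin loss rescaled by the network's Lipschitz constant; the exact CLL formulation in the paper may differ, and all parameter names here are illustrative.

```python
import torch
import torch.nn.functional as F

def calibrated_margin_loss(logits, targets, lipschitz_constant, margin, temperature=1.0):
    # Illustrative sketch only, not the paper's exact CLL formulation:
    # enforce a margin on the true-class logit and rescale by K so the
    # penalty stays calibrated as the Lipschitz constant K shrinks.
    offset = torch.zeros_like(logits)
    offset.scatter_(1, targets.unsqueeze(1), margin * lipschitz_constant)
    calibrated = (logits - offset) / (lipschitz_constant * temperature)
    return F.cross_entropy(calibrated, targets)
```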

Unlocking Accuracy and Fairness in Differentially Private Image Classification

Aug 21, 2023
Leonard Berrada, Soham De, Judy Hanwen Shen, Jamie Hayes, Robert Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, Borja Balle

Privacy-preserving machine learning aims to train models on private data without leaking sensitive information. Differential privacy (DP) is considered the gold standard framework for privacy-preserving training, as it provides formal privacy guarantees. However, compared to their non-private counterparts, models trained with DP often have significantly reduced accuracy. Private classifiers are also believed to exhibit larger performance disparities across subpopulations, raising fairness concerns. The poor performance of classifiers trained with DP has prevented the widespread adoption of privacy-preserving machine learning in industry. Here we show that pre-trained foundation models fine-tuned with DP can achieve similar accuracy to non-private classifiers, even in the presence of significant distribution shifts between pre-training data and downstream tasks. We achieve private accuracies within a few percent of the non-private state of the art across four datasets, including two medical imaging benchmarks. Furthermore, our private medical classifiers do not exhibit larger performance disparities across demographic groups than non-private models. This milestone towards making DP training a practical and reliable technology has the potential to enable machine learning practitioners to train safely on sensitive datasets while protecting individuals' privacy.
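
As background for what "fine-tuned with DP" involves, here is a minimal DP-SGD sketch (per-example gradient clipping plus Gaussian noise); the paper's actual training setup, privacy accounting, and hyperparameters are not reproduced here.

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, optimizer, clip_norm=1.0, noise_multiplier=1.0):
    # Illustrative DP-SGD step: clip each per-example gradient, sum them,
    # add calibrated Gaussian noise, then average. Not the paper's exact recipe.
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    optimizer.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm
        p.grad = (s + noise) / len(xs)
    optimizer.step()
```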


Conformal prediction under ambiguous ground truth

Jul 18, 2023
David Stutz, Abhijit Guha Roy, Tatiana Matejovicova, Patricia Strachan, Ali Taylan Cemgil, Arnaud Doucet

In safety-critical classification tasks, conformal prediction enables rigorous uncertainty quantification by providing confidence sets that include the true class with a user-specified probability. This generally assumes the availability of a held-out calibration set with access to ground truth labels. Unfortunately, in many domains, such labels are difficult to obtain and are usually approximated by aggregating expert opinions. In fact, this holds true for almost all datasets, including well-known ones such as CIFAR and ImageNet. Applying conformal prediction using such labels underestimates uncertainty. Indeed, when expert opinions are not resolvable, there is inherent ambiguity in the labels. That is, we do not have "crisp", definitive ground truth labels, and this uncertainty should be taken into account during calibration. In this paper, we develop a conformal prediction framework for such ambiguous ground truth settings which relies on an approximation of the underlying posterior distribution of labels given inputs. We demonstrate our methodology on synthetic and real datasets, including a case study of skin condition classification in dermatology.
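
For intuition, the sketch below contrasts standard split conformal calibration with a simple variant that samples calibration labels from an approximate posterior over labels (`label_posteriors` is an assumed input); averaging the resulting quantiles is a placeholder aggregation, not the paper's procedure.

```python
import numpy as np

def conformal_quantile(cal_probs, cal_labels, alpha=0.1):
    # Standard split conformal calibration with scores 1 - p_y(x).
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)  # finite-sample correction
    return np.quantile(scores, level, method="higher")

def ambiguous_conformal_quantile(cal_probs, label_posteriors, alpha=0.1, num_samples=10, seed=0):
    # Sketch: sample plausible labels from a per-example label posterior
    # instead of committing to a single aggregated label.
    rng = np.random.default_rng(seed)
    quantiles = []
    for _ in range(num_samples):
        sampled = np.array([rng.choice(len(p), p=p) for p in label_posteriors])
        quantiles.append(conformal_quantile(cal_probs, sampled, alpha))
    return float(np.mean(quantiles))  # placeholder aggregation

def prediction_set(probs, threshold):
    # Classes whose score 1 - p_k(x) falls below the calibrated threshold.
    return [np.where(1.0 - p <= threshold)[0] for p in probs]
```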


Evaluating AI systems under uncertain ground truth: a case study in dermatology

Jul 05, 2023
David Stutz, Ali Taylan Cemgil, Abhijit Guha Roy, Tatiana Matejovicova, Melih Barsbey, Patricia Strachan, Mike Schaekermann, Jan Freyberg, Rajeev Rikhye, Beverly Freeman, Javier Perez Matos, Umesh Telang, Dale R. Webster, Yuan Liu, Greg S. Corrado, Yossi Matias, Pushmeet Kohli, Yun Liu, Arnaud Doucet, Alan Karthikesalingam

For safety, AI systems in health undergo thorough evaluations before deployment, validating their predictions against a ground truth that is assumed certain. However, this is often not the case: the ground truth may be uncertain. Unfortunately, this is largely ignored in standard evaluation of AI models but can have severe consequences such as overestimating future performance. To avoid this, we measure the effects of ground truth uncertainty, which we assume decomposes into two main components: annotation uncertainty, which stems from the lack of reliable annotations, and inherent uncertainty due to limited observational information. This ground truth uncertainty is ignored when estimating the ground truth by deterministically aggregating annotations, e.g., by majority voting or averaging. In contrast, we propose a framework where aggregation is done using a statistical model. Specifically, we frame aggregation of annotations as posterior inference of so-called plausibilities, representing distributions over classes in a classification setting, subject to a hyper-parameter encoding annotator reliability. Based on this model, we propose a metric for measuring annotation uncertainty and provide uncertainty-adjusted metrics for performance evaluation. We present a case study applying our framework to skin condition classification from images where annotations are provided in the form of differential diagnoses. The deterministic adjudication process used in previous work, called inverse rank normalization (IRN), ignores ground truth uncertainty in evaluation. Instead, we present two alternative statistical models: a probabilistic version of IRN and a Plackett-Luce-based model. We find that a large portion of the dataset exhibits significant ground truth uncertainty and that standard IRN-based evaluation severely overestimates performance without providing uncertainty estimates.
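
The following sketch illustrates the uncertainty-adjusted evaluation idea: score predictions against sampled plausibilities rather than a single adjudicated label. Here `plausibility_samples` stands in for draws from the paper's statistical aggregation models, and the metric is a simple illustrative choice.

```python
import numpy as np

def uncertainty_adjusted_topk(pred_rankings, plausibility_samples, k=3):
    # plausibility_samples: (num_samples, num_cases, num_classes) draws from a
    # statistical aggregation model (e.g. probabilistic IRN or Plackett-Luce).
    # For each draw, credit a case with the plausibility mass covered by the
    # model's top-k predicted conditions, then aggregate across draws.
    scores = []
    for sample in plausibility_samples:
        per_case = [sample[i, ranking[:k]].sum() for i, ranking in enumerate(pred_rankings)]
        scores.append(np.mean(per_case))
    # The mean is an uncertainty-adjusted point estimate; the spread across
    # draws reflects how much ground truth uncertainty affects the metric.
    return float(np.mean(scores)), float(np.std(scores))
```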


Robustifying Token Attention for Vision Transformers

Apr 04, 2023
Yong Guo, David Stutz, Bernt Schiele

Despite the success of vision transformers (ViTs), they still suffer from significant drops in accuracy in the presence of common corruptions, such as noise or blur. Interestingly, we observe that the attention mechanism of ViTs tends to rely on a few important tokens, a phenomenon we call token overfocusing. More critically, these tokens are not robust to corruptions, often leading to highly diverging attention patterns. In this paper, we aim to alleviate this overfocusing issue and make attention more stable through two general techniques: First, our Token-aware Average Pooling (TAP) module encourages the local neighborhood of each token to take part in the attention mechanism. Specifically, TAP learns average pooling schemes for each token such that the information of potentially important tokens in the neighborhood can adaptively be taken into account. Second, we force the output tokens to aggregate information from a diverse set of input tokens rather than focusing on just a few by using our Attention Diversification Loss (ADL). We achieve this by penalizing high cosine similarity between the attention vectors of different tokens. In experiments, we apply our methods to a wide range of transformer architectures and improve robustness significantly. For example, we improve corruption robustness on ImageNet-C by 2.4% while simultaneously improving accuracy by 0.4% based on the state-of-the-art robust architecture FAN. Also, when fine-tuning on semantic segmentation tasks, we improve robustness on CityScapes-C by 2.4% and ACDC by 3.1%.

* 7 figures and 5 tables in the main paper 
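
The diversification term can be sketched as a penalty on pairwise cosine similarity between attention vectors; the tensor shapes and reduction below are assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def attention_diversification_loss(attn):
    # attn: (batch, heads, num_tokens, num_tokens) attention weights.
    # Penalize high cosine similarity between the attention vectors of
    # different output tokens so they attend to diverse input tokens.
    b, h, n, _ = attn.shape
    vectors = F.normalize(attn.reshape(b * h, n, n), dim=-1)
    sim = torch.bmm(vectors, vectors.transpose(1, 2))  # pairwise cosine similarities
    sim = sim - torch.diag_embed(torch.diagonal(sim, dim1=-2, dim2=-1))  # drop self-similarity
    return sim.abs().sum() / (b * h * n * (n - 1))
```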

On Fragile Features and Batch Normalization in Adversarial Training

Apr 26, 2022
Nils Philipp Walter, David Stutz, Bernt Schiele

Modern deep learning architectures utilize batch normalization (BN) to stabilize training and improve accuracy. It has been shown that the BN layers alone are surprisingly expressive. In the context of robustness against adversarial examples, however, BN is argued to increase vulnerability. That is, BN helps to learn fragile features. Nevertheless, BN is still used in adversarial training, which is the de-facto standard for learning robust features. In order to shed light on the role of BN in adversarial training, we investigate to what extent the expressiveness of BN can be used to robustify fragile features in comparison to random features. On CIFAR-10, we find that adversarially fine-tuning just the BN layers can result in non-trivial adversarial robustness. Adversarially training only the BN layers from scratch, in contrast, is not able to convey meaningful adversarial robustness. Our results indicate that fragile features can be used to learn models with moderate adversarial robustness, while random features cannot.
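
A simplified version of the BN-only adversarial fine-tuning setup is sketched below: freeze everything except the BN parameters and train on PGD adversarial examples; the attack hyperparameters are illustrative. An adversarial fine-tuning loop would then optimize only the parameters returned by `bn_parameters_only(model)` on the outputs of `pgd_attack`.

```python
import torch
import torch.nn as nn

def bn_parameters_only(model):
    # Freeze all parameters except those of batch-normalization layers.
    for module in model.modules():
        is_bn = isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d))
        for p in module.parameters(recurse=False):
            p.requires_grad = is_bn
    return [p for p in model.parameters() if p.requires_grad]

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    # Standard L-infinity PGD to craft training-time adversarial examples.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = nn.functional.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```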


Improving Corruption and Adversarial Robustness by Enhancing Weak Subnets

Jan 30, 2022
Yong Guo, David Stutz, Bernt Schiele

Deep neural networks have achieved great success in many computer vision tasks. However, deep networks have been shown to be very susceptible to corrupted or adversarial images, which often result in significant performance drops. In this paper, we observe that weak subnetwork (subnet) performance is correlated with a lack of robustness against corruptions and adversarial attacks. Based on that observation, we propose a novel robust training method that explicitly identifies and enhances weak subnets (EWS) during training to improve robustness. Specifically, we develop a search algorithm to find particularly weak subnets and propose to explicitly strengthen them via knowledge distillation from the full network. We show that EWS greatly improves robustness against corrupted images as well as accuracy on clean data. EWS is complementary to many state-of-the-art data augmentation approaches and consistently improves corruption robustness on top of them. Moreover, EWS also boosts adversarial robustness when combined with popular adversarial training methods.
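
Below is a rough sketch of the enhance-weak-subnets idea: sample a few candidate subnets by dropping blocks, pick the weakest, and distill the full network's predictions into it. The `block_mask` interface and `num_blocks` attribute are hypothetical, and the paper's actual search and strengthening procedure is more involved.

```python
import torch
import torch.nn.functional as F

def sample_subnet_mask(num_blocks, keep_prob=0.8):
    # Randomly keep/drop residual blocks to define a candidate subnet.
    return torch.rand(num_blocks) < keep_prob

def weak_subnet_distillation_loss(model, x, y, num_candidates=4, temperature=4.0):
    # Hypothetical interface: model(x, block_mask=...) runs a subnet that skips
    # the dropped blocks; model.num_blocks gives the number of maskable blocks.
    with torch.no_grad():
        full_logits = model(x)  # teacher: the full network
    worst_loss, worst_mask = None, None
    for _ in range(num_candidates):  # crude search for a weak subnet
        mask = sample_subnet_mask(model.num_blocks)
        loss = F.cross_entropy(model(x, block_mask=mask), y)
        if worst_loss is None or loss > worst_loss:
            worst_loss, worst_mask = loss.detach(), mask
    student_logits = model(x, block_mask=worst_mask)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(full_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return F.cross_entropy(model(x), y) + kd
```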


Learning Optimal Conformal Classifiers

Oct 18, 2021
David Stutz, Krishnamurthy Dvijotham, Ali Taylan Cemgil, Arnaud Doucet

Modern deep learning-based classifiers show very high accuracy on test data, but this does not provide sufficient guarantees for safe deployment, especially in high-stakes AI applications such as medical diagnosis. Usually, predictions are obtained without a reliable uncertainty estimate or a formal guarantee. Conformal prediction (CP) addresses these issues by using the classifier's probability estimates to predict confidence sets containing the true class with a user-specified probability. However, using CP as a separate processing step after training prevents the underlying model from adapting to the prediction of confidence sets. Thus, this paper explores strategies to differentiate through CP during training with the goal of training the model with the conformal wrapper end-to-end. In our approach, conformal training (ConfTr), we specifically "simulate" conformalization on mini-batches during training. We show that ConfTr outperforms state-of-the-art CP methods for classification by reducing the average confidence set size (inefficiency). Moreover, it allows us to "shape" the confidence sets predicted at test time, which is difficult for standard CP. In experiments on several datasets, we show that ConfTr can influence how inefficiency is distributed across classes, or guide the composition of confidence sets in terms of the included classes, while retaining the guarantees offered by CP.
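
A minimal sketch of the "simulated conformalization" step is shown below: split each mini-batch into a calibration and a prediction half, compute a threshold on the calibration half, and penalize a differentiable surrogate of the confidence set size. The quantile and sigmoid relaxation are illustrative choices, not necessarily those used by ConfTr.

```python
import torch

def simulated_conformal_size_loss(probs, labels, alpha=0.1, temperature=0.1, target_size=1.0):
    # Split the mini-batch: first half calibrates, second half is "predicted".
    n = probs.shape[0] // 2
    cal_probs, cal_labels = probs[:n], labels[:n]
    pred_probs = probs[n:]
    # Conformity scores on the calibration half: probability of the true class.
    cal_scores = cal_probs[torch.arange(n), cal_labels]
    tau = torch.quantile(cal_scores, alpha)  # stand-in for a smooth quantile
    # Soft membership of every class in the confidence set on the prediction half.
    soft_sets = torch.sigmoid((pred_probs - tau) / temperature)
    # Penalize confidence sets larger than the target size.
    return torch.clamp(soft_sets.sum(dim=1) - target_size, min=0).mean()
```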


A Closer Look at the Adversarial Robustness of Information Bottleneck Models

Jul 12, 2021
Iryna Korshunova, David Stutz, Alexander A. Alemi, Olivia Wiles, Sven Gowal

We study the adversarial robustness of information bottleneck models for classification. Previous works showed that the robustness of models trained with information bottlenecks can improve upon adversarial training. Our evaluation under a diverse range of white-box $l_{\infty}$ attacks suggests that information bottlenecks alone are not a strong defense strategy, and that previous results were likely influenced by gradient obfuscation.
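
For reference, a worst-case robust accuracy evaluation over multiple attack restarts can be sketched as follows; `attack_fn` stands in for any white-box $l_{\infty}$ attack (e.g. a PGD variant) and is not the paper's exact attack suite.

```python
import torch

def robust_accuracy(model, loader, attack_fn, num_restarts=5):
    # An example counts as robust only if it is classified correctly
    # under every attack restart (worst-case over restarts).
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        still_robust = torch.ones(len(y), dtype=torch.bool, device=y.device)
        for _ in range(num_restarts):  # random restarts diversify the attack
            x_adv = attack_fn(model, x, y)
            with torch.no_grad():
                preds = model(x_adv).argmax(dim=1)
            still_robust &= preds.eq(y)
        correct += still_robust.sum().item()
        total += len(y)
    return correct / total
```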
