Arno Blaas

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Jul 20, 2023
Borja Rodríguez-Gálvez, Arno Blaas, Pau Rodríguez, Adam Goliński, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella

The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients. GitHub repo: https://github.com/apple/ml-entropy-reconstruction.
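For concreteness, the ER bound in question is an instance of the standard variational (Barber-Agakov) lower bound on the MI between the two views' representations $Z_1$ and $Z_2$; the notation below is a sketch consistent with the abstract, not necessarily the paper's exact formulation:

$$I(Z_1; Z_2) = H(Z_2) - H(Z_2 \mid Z_1) \geq \underbrace{H(Z_2)}_{\text{entropy (E)}} + \underbrace{\mathbb{E}\left[\log q_{Z_2 \mid Z_1}(Z_2 \mid Z_1)\right]}_{\text{reconstruction (R)}},$$

where $q_{Z_2 \mid Z_1}$ is any variational approximation of the true conditional $p_{Z_2 \mid Z_1}$; the inequality follows from the non-negativity of the KL divergence between $p_{Z_2 \mid Z_1}$ and $q_{Z_2 \mid Z_1}$.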

* 18 pages: 9 of main text, 2 of references, and 7 of supplementary material. Appears in the proceedings of ICML 2023 

DUET: 2D Structured and Approximately Equivariant Representations

Jun 30, 2023
Xavier Suau, Federico Danieli, T. Anderson Keller, Arno Blaas, Chen Huang, Jason Ramapuram, Dan Busbridge, Luca Zappella

Multiview Self-Supervised Learning (MSSL) is based on learning invariances with respect to a set of input transformations. However, invariance partially or totally removes transformation-related information from the representations, which might harm performance for specific downstream tasks that require such information. We propose 2D strUctured and EquivarianT representations (coined DUET), which are 2D representations organized in a matrix structure and equivariant with respect to transformations acting on the input data. DUET representations maintain information about an input transformation, while remaining semantically expressive. Compared to SimCLR (Chen et al., 2020) (unstructured and invariant) and ESSL (Dangovski et al., 2022) (unstructured and equivariant), the structured and equivariant nature of DUET representations enables controlled generation with lower reconstruction error, while controllability is not possible with SimCLR or ESSL. DUET also achieves higher accuracy for several discriminative tasks, and improves transfer learning.
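As a rough illustration of the structured-equivariance idea, the sketch below treats the representation as an $R \times C$ matrix on which an input transformation acts as a cyclic column shift; the shapes, the shift action, and all names here are illustrative assumptions, not DUET's actual architecture or loss:

    import torch

    def duet_style_equivariance_loss(encoder, x, x_t, shift):
        """Toy sketch: encourage f(t(x)) ~ rho(t) f(x), where the representation
        is reshaped into an R x C matrix and rho(t) is a cyclic column shift."""
        R, C = 8, 16                                   # assumed matrix structure, d = R * C
        z = encoder(x).view(-1, R, C)                  # representation of the original view
        z_t = encoder(x_t).view(-1, R, C)              # representation of the transformed view
        z_acted = torch.roll(z, shifts=shift, dims=2)  # structured action of the transformation
        return ((z_t - z_acted) ** 2).mean()           # approximate-equivariance penalty

Minimising such a penalty alongside a standard SSL objective would keep the transformation recoverable from the representation, which is the controllability property highlighted above.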

* Accepted at ICML 2023 

Challenges of Adversarial Image Augmentations

Dec 03, 2021
Arno Blaas, Xavier Suau, Jason Ramapuram, Nicholas Apostoloff, Luca Zappella

Image augmentations applied during training are crucial for the generalization performance of image classifiers. Therefore, a large body of research has focused on finding the optimal augmentation policy for a given task. Yet, RandAugment [2], a simple random augmentation policy, has recently been shown to outperform existing sophisticated policies. Only Adversarial AutoAugment (AdvAA) [11], an approach based on the idea of adversarial training, has been shown to be better than RandAugment. In this paper, we show that random augmentations are still competitive compared to an optimal adversarial approach, as well as to simple curricula, and conjecture that the success of AdvAA is due to the stochasticity of the policy controller network, which introduces a mild form of curriculum.
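For reference, a RandAugment-style policy is simple enough to sketch in a few lines; the operation pool and magnitude mapping below are illustrative stand-ins rather than the exact search space of [2]:

    import random
    from PIL import ImageEnhance, ImageOps

    # Illustrative operation pool; each op maps (image, magnitude in [0, 1]) to an image.
    OPS = [
        lambda im, m: im.rotate(30 * m),
        lambda im, m: ImageOps.solarize(im, threshold=int(255 * (1 - m))),
        lambda im, m: ImageOps.posterize(im, bits=max(1, int(8 * (1 - m)))),
        lambda im, m: ImageEnhance.Contrast(im).enhance(1 + m),
    ]

    def rand_augment(image, n=2, magnitude=0.5):
        """Apply n uniformly sampled ops at a shared magnitude (the two
        hyperparameters of a RandAugment-style policy)."""
        for op in random.choices(OPS, k=n):
            image = op(image, magnitude)
        return image

A controller-free policy of this kind is what the paper argues remains competitive once the stochasticity of AdvAA's controller is accounted for.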

* To appear at the ICBINB 2021 Neurips Workshop 

Adversarial Attacks on Graph Classification via Bayesian Optimisation

Nov 04, 2021
Xingchen Wan, Henry Kenlay, Binxin Ru, Arno Blaas, Michael A. Osborne, Xiaowen Dong

Graph neural networks, a popular class of models effective in a wide range of graph-based learning tasks, have been shown to be vulnerable to adversarial attacks. While the majority of the literature focuses on such vulnerability in node-level classification tasks, little effort has been dedicated to analysing adversarial attacks on graph-level classification, an important problem with numerous real-life applications such as biochemistry and social network analysis. The few existing methods often require unrealistic setups, such as access to internal information of the victim models, or an impractically large number of queries. We present a novel Bayesian optimisation-based attack method for graph classification models. Our method is black-box, query-efficient and parsimonious with respect to the perturbation applied. We empirically validate the effectiveness and flexibility of the proposed method on a wide range of graph classification tasks involving varying graph properties, constraints and modes of attack. Finally, we analyse common interpretable patterns behind the adversarial samples produced, which may shed further light on the adversarial robustness of graph classification models.
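The query loop behind such an attack can be sketched generically; the GP surrogate, the UCB acquisition, and the flat array of candidate perturbations below are simplifying assumptions, standing in for the paper's specialised surrogates and graph kernels:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def bo_attack(victim_loss, candidates, n_queries=50, n_init=5):
        """Query-efficient black-box loop: fit a GP surrogate to observed
        victim losses, then query the candidate with the best UCB score.
        `victim_loss` and `candidates` are hypothetical stand-ins."""
        rng = np.random.default_rng(0)
        idx = rng.choice(len(candidates), size=n_init, replace=False)
        X = [candidates[i] for i in idx]                            # initial random queries
        y = [victim_loss(x) for x in X]
        gp = GaussianProcessRegressor(normalize_y=True)
        for _ in range(n_queries - n_init):
            gp.fit(np.array(X), np.array(y))
            mu, sigma = gp.predict(candidates, return_std=True)
            x_next = candidates[int(np.argmax(mu + 1.96 * sigma))]  # UCB acquisition
            X.append(x_next)
            y.append(victim_loss(x_next))                           # one victim query per step
        best = int(np.argmax(y))
        return X[best], y[best]

Query efficiency comes from spending each victim query where the surrogate's upper confidence bound is highest, rather than sampling perturbations at random.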

* NeurIPS 2021. 11 pages, 8 figures, 2 tables (24 pages, 17 figures, 8 tables including references and appendices) 

On Invariance Penalties for Risk Minimization

Jun 17, 2021
Kia Khezeli, Arno Blaas, Frank Soboczenski, Nicholas Chia, John Kalantari

The Invariant Risk Minimization (IRM) principle was first proposed by Arjovsky et al. [2019] to address the domain generalization problem by leveraging data heterogeneity from differing experimental conditions. Specifically, IRM seeks to find a data representation under which an optimal classifier remains invariant across all domains. Despite the conceptual appeal of IRM, the effectiveness of the originally proposed invariance penalty has recently been brought into question. In particular, there exist counterexamples for which this invariance penalty can be arbitrarily small for non-invariant data representations. We propose an alternative invariance penalty by revisiting the Gramian matrix of the data representation. We discuss the role of its eigenvalues in the relationship between the risk and the invariance penalty, and demonstrate that it is ill-conditioned for said counterexamples. The proposed approach is guaranteed to recover an invariant representation for linear settings under mild non-degeneracy conditions. Its effectiveness is substantiated by experiments on DomainBed and InvarianceUnitTest, two extensive test beds for domain generalization.
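For context, the originally proposed penalty whose effectiveness is questioned here is the IRMv1 term of Arjovsky et al. [2019] (shown as background; the paper's alternative differs):

$$\min_{\Phi} \sum_{e \in \mathcal{E}} \left[ R^e(\Phi) + \lambda \left\| \nabla_{w \mid w = 1.0} \, R^e(w \cdot \Phi) \right\|^2 \right],$$

where $R^e$ is the risk in environment $e$ and $\Phi$ the data representation. The alternative penalty proposed above is instead built from the Gramian matrix of the representation, $\mathbb{E}\left[\Phi(X)\Phi(X)^\top\right]$, whose eigenvalues connect the risk to the invariance penalty and expose the ill-conditioning behind the known counterexamples.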


Adversarial Robustness Guarantees for Gaussian Processes

Apr 07, 2021
Andrea Patane, Arno Blaas, Luca Laurenti, Luca Cardelli, Stephen Roberts, Marta Kwiatkowska

Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications. Such scenarios demand that GP decisions are not only accurate, but also robust to perturbations. In this paper we present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations. Given a compact subset of the input space $T\subseteq \mathbb{R}^d$, a point $x^*$ and a GP, we provide provable guarantees of adversarial robustness of the GP by computing lower and upper bounds on its prediction range in $T$. We develop a branch-and-bound scheme to refine the bounds and show, for any $\epsilon > 0$, that our algorithm is guaranteed to converge to values $\epsilon$-close to the actual values in finitely many iterations. The algorithm is anytime and can handle both regression and classification tasks, with analytical formulations for most kernels used in practice. We evaluate our methods on a collection of synthetic and standard benchmark datasets, including SPAM, MNIST and FashionMNIST. We study the effect of approximate inference techniques on robustness and demonstrate how our method can be used for interpretability. Our empirical results suggest that the adversarial robustness of GPs increases with accurate posterior estimation.
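The refinement scheme can be sketched generically as branch-and-bound for $\sup_{x \in T} f(x)$ over axis-aligned boxes (the infimum case is symmetric); here `upper_fn`, which must return a provable upper bound on the GP prediction over a box, is a hypothetical stand-in for the paper's kernel-specific bounds:

    import heapq

    def branch_and_bound(box, f, upper_fn, eps):
        """Refine bounds on sup f over `box` (a list of (lo, hi) intervals)
        until the gap between the bounds is at most eps."""
        def center(b):
            return [0.5 * (lo + hi) for lo, hi in b]
        best = f(center(box))                          # feasible lower bound on the sup
        heap = [(-upper_fn(box), box)]                 # always refine the loosest box
        while -heap[0][0] - best > eps:
            _, b = heapq.heappop(heap)
            d = max(range(len(b)), key=lambda i: b[i][1] - b[i][0])
            mid = 0.5 * (b[d][0] + b[d][1])            # bisect the widest dimension
            for half in (b[:d] + [(b[d][0], mid)] + b[d + 1:],
                         b[:d] + [(mid, b[d][1])] + b[d + 1:]):
                best = max(best, f(center(half)))      # improve the feasible value
                heapq.heappush(heap, (-upper_fn(half), half))
        return best, -heap[0][0]                       # eps-tight lower/upper bounds

Consistent with the anytime property claimed above, the gap between the two returned bounds shrinks monotonically, so the loop can be stopped early with valid (if looser) guarantees.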

* Submitted for publication 

The Effect of Prior Lipschitz Continuity on the Adversarial Robustness of Bayesian Neural Networks

Jan 07, 2021
Arno Blaas, Stephen J. Roberts

It is desirable, and often a necessity, for machine learning models to be robust against adversarial attacks. This is particularly true for Bayesian models, as they are well-suited for safety-critical applications, in which adversarial attacks can have catastrophic outcomes. In this work, we take a deeper look at the adversarial robustness of Bayesian Neural Networks (BNNs). In particular, we consider whether the adversarial robustness of a BNN can be increased by model choices, particularly the Lipschitz continuity induced by the prior. Conducting an in-depth analysis of the case of i.i.d., zero-mean Gaussian priors and posteriors approximated via mean-field variational inference, we find evidence that adversarial robustness is indeed sensitive to the prior variance.
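One way to make "the Lipschitz continuity induced by the prior" concrete is the standard layer-wise bound (a generic fact, not necessarily the exact quantity analysed in the paper): for an $L$-layer network $f_w$ with 1-Lipschitz activations,

$$\mathrm{Lip}(f_w) \leq \prod_{l=1}^{L} \left\| W^{(l)} \right\|_2,$$

so an i.i.d. zero-mean Gaussian prior $w_i \sim \mathcal{N}(0, \sigma_p^2)$ controls the typical scale of the weight matrices $W^{(l)}$, and hence of the network's Lipschitz constant; this is the channel through which the prior variance can plausibly affect adversarial robustness.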

* 4 pages, 2 tables, AAAI 2021 Workshop Towards Robust, Secure and Efficient Machine Learning 

Robustness Quantification for Classification with Gaussian Processes

May 28, 2019
Arno Blaas, Luca Laurenti, Andrea Patane, Luca Cardelli, Marta Kwiatkowska, Stephen Roberts

We consider Bayesian classification with Gaussian processes (GPs) and define robustness of a classifier in terms of the worst-case difference in the classification probabilities with respect to input perturbations. For a subset of the input space $T\subseteq \mathbb{R}^m$ such properties reduce to computing the infimum and supremum of the classification probabilities for all points in $T$. Unfortunately, computing the above values is very challenging, as the classification probabilities cannot be expressed analytically. Nevertheless, using the theory of Gaussian processes, we develop a framework that, for a given dataset $\mathcal{D}$, a compact set of input points $T\subseteq \mathbb{R}^m$ and an error threshold $\epsilon>0$, computes lower and upper bounds of the classification probabilities by over-approximating the exact range with an error bounded by $\epsilon$. We provide an experimental comparison of several approximate inference methods for classification on tasks associated with the MNIST and SPAM datasets, showing that our results enable quantification of uncertainty in adversarial classification settings.
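In symbols, writing $\pi(x) = p(y = c \mid x, \mathcal{D})$ for a class probability, the quantities of interest are

$$\pi_{\min}(T) = \inf_{x \in T} \pi(x), \qquad \pi_{\max}(T) = \sup_{x \in T} \pi(x),$$

and the framework returns $\underline{\pi} \leq \pi_{\min}(T)$ and $\overline{\pi} \geq \pi_{\max}(T)$ with $\pi_{\min}(T) - \underline{\pi} \leq \epsilon$ and $\overline{\pi} - \pi_{\max}(T) \leq \epsilon$ (the notation here is a paraphrase of the abstract, not necessarily the paper's).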

* 10 pages, 3 figures + Appendix 