Jayaraman J. Thiagarajan

Lawrence Livermore National Laboratory, Livermore, CA

Out of Distribution Detection via Neural Network Anchoring

Jul 08, 2022
Rushil Anirudh, Jayaraman J. Thiagarajan

Our goal in this paper is to exploit heteroscedastic temperature scaling as a calibration strategy for out-of-distribution (OOD) detection. Heteroscedasticity here refers to the fact that the optimal temperature parameter can differ from sample to sample, as opposed to conventional approaches that use the same value for the entire distribution. To enable this, we propose a new training strategy called anchoring that can estimate appropriate temperature values for each sample, leading to state-of-the-art OOD detection performance across several benchmarks. Using neural tangent kernel (NTK) theory, we show that this temperature function estimate is closely linked to the epistemic uncertainty of the classifier, which explains its behavior. In contrast to some of the best-performing OOD detection approaches, our method does not require exposure to additional outlier datasets, custom calibration objectives, or model ensembling. Through empirical studies with different OOD detection settings (far OOD, near OOD, and semantically coherent OOD), we establish a highly effective OOD detection approach. Code and models are available at https://github.com/rushilanirudh/AMP
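
The difference between a shared temperature and a per-sample temperature can be illustrated with a minimal sketch in plain Python. This is not the anchoring method from the paper: the temperatures below are hand-picked for illustration rather than estimated by a model, and the function names `softmax_with_temperature` and `max_confidence` are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature (temperature must be > 0)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def max_confidence(logits, temperature):
    """Maximum softmax probability, a commonly used OOD score."""
    return max(softmax_with_temperature(logits, temperature))

# Two samples with identical logits (illustrative values only).
batch_logits = [[4.0, 1.0, 0.0], [4.0, 1.0, 0.0]]

# Homoscedastic: one temperature shared by every sample.
shared_t = 1.0
conf_shared = [max_confidence(z, shared_t) for z in batch_logits]

# Heteroscedastic: each sample gets its own temperature.
# Assumption: these values are hand-picked, not learned.
per_sample_t = [1.0, 5.0]
conf_per_sample = [max_confidence(z, t) for z, t in zip(batch_logits, per_sample_t)]

print(conf_shared)
print(conf_per_sample)
```

With a shared temperature the two samples receive the same confidence; with per-sample temperatures, the sample assigned the larger temperature gets a flatter softmax and therefore a lower confidence score.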

Improving Diversity with Adversarially Learned Transformations for Domain Generalization

Jun 15, 2022
Tejas Gokhale, Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Chitta Baral, Yezhou Yang

To be successful in single-source domain generalization, maximizing the diversity of synthesized domains has emerged as one of the most effective strategies. Many of the recent successes have come from methods that pre-specify the types of diversity that a model is exposed to during training, so that it can ultimately generalize well to new domains. However, naive diversity-based augmentations do not work effectively for domain generalization, either because they cannot model large domain shift or because the span of pre-specified transforms does not cover the types of shift commonly occurring in domain generalization. To address this issue, we present a novel framework, adversarially learned transformations (ALT), that uses a neural network to model plausible, yet hard image transformations that fool the classifier. This network is randomly initialized for each batch and trained for a fixed number of steps to maximize classification error. Further, we enforce consistency between the classifier's predictions on the clean and transformed images. With extensive empirical analysis, we find that this new form of adversarial transformation achieves both objectives of diversity and hardness simultaneously, outperforming all existing techniques on competitive benchmarks for single-source domain generalization. We also show that ALT can naturally work with existing diversity modules to produce highly distinct and large transformations of the source domain, leading to state-of-the-art performance.

* Code for ALT is available at https://github.com/tejas-gokhale/ALT
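
The consistency term mentioned in the abstract can be sketched as a divergence between the classifier's predictions on clean and transformed inputs. The abstract does not say which divergence is used, so the KL divergence below is an assumption, and the names `kl_divergence` and `consistency_penalty` are hypothetical. The sketch covers only this penalty, not the adversarial training of the transformation network.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_penalty(clean_logits, transformed_logits):
    """Assumed form of the consistency term: KL between the two predictions."""
    return kl_divergence(softmax(clean_logits), softmax(transformed_logits))

# Illustrative logits: the transformed image flips the predicted class.
clean = [2.0, 0.5, -1.0]
transformed = [-1.0, 0.5, 2.0]
print(consistency_penalty(clean, clean))
print(consistency_penalty(clean, transformed))
```

The penalty vanishes when the two predictions agree and grows as they diverge, which is what lets it counterbalance a transformation that is trained to fool the classifier.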

Improving Multi-Domain Generalization through Domain Re-labeling

Dec 17, 2021
Kowshik Thopalli, Sameeksha Katoch, Andreas Spanias, Pavan Turaga, Jayaraman J. Thiagarajan

Domain generalization (DG) methods aim to develop models that generalize to settings where the test distribution is different from the training data. In this paper, we focus on the challenging problem of multi-source zero-shot DG, where labeled training data from multiple source domains is available but with no access to data from the target domain. Though this problem has become an important topic of research, surprisingly, the simple solution of pooling all source data together and training a single classifier, i.e., empirical risk minimization (ERM), is highly competitive on standard benchmarks. More importantly, even sophisticated approaches that explicitly optimize for invariance across different domains do not necessarily provide non-trivial gains over ERM. In this paper, for the first time, we study the important link between pre-specified domain labels and the generalization performance. Using a motivating case study and a new variant of a distributionally robust optimization algorithm, GroupDRO++, we first demonstrate how inferring custom domain groups can lead to consistent improvements over the original domain labels that come with the dataset. Subsequently, we introduce a general approach for multi-domain generalization, MulDEns, that uses an ERM-based deep ensembling backbone and performs implicit domain re-labeling through a meta-optimization algorithm. Using empirical studies on multiple standard benchmarks, we show that MulDEns does not require tailoring the augmentation strategy or the training process to a specific dataset, consistently outperforms ERM by significant margins, and produces state-of-the-art generalization performance, even when compared to existing methods that exploit the domain labels.
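
Why the choice of domain labels matters for group-based robust optimization can be seen in a small sketch. It computes the plain worst-group loss that group-based DRO objectives build on; it is not GroupDRO++ or MulDEns, and the losses, labels, and function names are made up for illustration.

```python
from collections import defaultdict

def group_losses(losses, group_ids):
    """Average loss per group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for loss, g in zip(losses, group_ids):
        sums[g] += loss
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}

def worst_group_loss(losses, group_ids):
    """Loss of the worst-performing group, the quantity a group-robust objective targets."""
    return max(group_losses(losses, group_ids).values())

# Per-sample losses for four training examples (illustrative values).
losses = [0.2, 0.4, 1.0, 1.2]

# Domain labels shipped with the dataset vs. a hypothetical re-labeling.
original_domains = [0, 1, 0, 1]
relabeled_domains = [0, 0, 1, 1]

erm_loss = sum(losses) / len(losses)  # pooled average, ignores groups
print(erm_loss)
print(worst_group_loss(losses, original_domains))
print(worst_group_loss(losses, relabeled_domains))
```

The same per-sample losses yield different worst-group values under the two labelings, so the grouping determines which examples the robust objective emphasizes.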

Geometric Priors for Scientific Generative Models in Inertial Confinement Fusion

Nov 24, 2021
Ankita Shukla, Rushil Anirudh, Eugene Kur, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Brian K. Spears, Tammy Ma, Pavan Turaga

In this paper, we develop a Wasserstein autoencoder (WAE) with a hyperspherical prior for multimodal data in the application of inertial confinement fusion. Unlike a typical hyperspherical generative model that requires computationally inefficient sampling from distributions like the von Mises-Fisher, we sample from a normal distribution followed by a projection layer before the generator. Finally, to determine the validity of the generated samples, we exploit a known relationship between the modalities in the dataset as a scientific constraint, and study different properties of the proposed model.

* 5 pages, 4 figures, Fourth Workshop on Machine Learning and the Physical Sciences, NeurIPS 2021
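
The sampling step described in the abstract, a normal draw followed by a projection, can be sketched as follows. The sketch assumes the projection is an L2 normalization onto the unit hypersphere, which the abstract does not state explicitly; the function name `sample_on_hypersphere` and the latent dimension are hypothetical.

```python
import math
import random

def sample_on_hypersphere(dim, rng):
    """Draw z ~ N(0, I) and project it onto the unit hypersphere."""
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in z))
    # Assumed projection: divide by the L2 norm.
    return [v / norm for v in z]

rng = random.Random(0)  # fixed seed for reproducibility
latent = sample_on_hypersphere(8, rng)
print(math.sqrt(sum(v * v for v in latent)))  # unit norm up to rounding
```

Because an isotropic Gaussian has no preferred direction, normalizing its samples gives points spread uniformly over the sphere without a dedicated von Mises-Fisher sampler.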

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

Oct 08, 2021
Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, Nikola Nikolov, Nicolas Padoy, Gennady Pekhimenko, Vijay Janapa Reddi, G Anthony Reina, Pablo Ribalta, Jacob Rosenthal, Abhishek Singh, Jayaraman J. Thiagarajan, Anna Wuest, Maria Xenochristou, Daguang Xu, Poonam Yadav, Michael Rosenthal, Massimo Loda, Jason M. Johnson, Peter Mattson

Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf, an open framework for benchmarking machine learning in the medical domain. MedPerf will enable federated evaluation in which models are securely distributed to different facilities for evaluation, thereby empowering healthcare organizations to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status, and our roadmap. We call for researchers and organizations to join us in creating the MedPerf open benchmarking platform.

$Δ$-UQ: Accurate Uncertainty Quantification via Anchor Marginalization

Oct 05, 2021
Rushil Anirudh, Jayaraman J. Thiagarajan

We present $\Delta$-UQ, a novel, general-purpose uncertainty estimator using the concept of anchoring in predictive models. Anchoring works by first transforming the input into a tuple consisting of an anchor point drawn from a prior distribution, and a combination of the input sample with the anchor using a pretext encoding scheme. This encoding is such that the original input can be perfectly recovered from the tuple, regardless of the choice of the anchor. Therefore, any predictive model should be able to predict the target response from the tuple alone (since it implicitly represents the input). Moreover, by varying the anchors for a fixed sample, we can estimate uncertainty in the prediction even using only a single predictive model. We find this uncertainty is deeply connected to improper sampling of the input data and to inherent noise, enabling us to estimate the total uncertainty in any system. With extensive empirical studies on a variety of use cases, we demonstrate that $\Delta$-UQ outperforms several competitive baselines. Specifically, we study model fitting, sequential model optimization, and model-based inversion in the regression setting, and out-of-distribution detection and calibration under distribution shifts for classification.
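
The anchoring idea described above can be sketched in plain Python. The sketch assumes the pretext encoding is the residual `x - anchor`, which is one encoding that makes the input exactly recoverable; the abstract does not specify the scheme. `toy_model` is a hypothetical stand-in for a trained network, and all names and values are illustrative.

```python
import random
import statistics

def encode(x, anchor):
    """Return the tuple (anchor, residual); assumed encoding: residual = x - anchor."""
    return anchor, [xi - ai for xi, ai in zip(x, anchor)]

def decode(anchor, residual):
    """Recover the original input from the tuple."""
    return [ai + ri for ai, ri in zip(anchor, residual)]

def toy_model(anchor, residual):
    """Hypothetical predictor on the tuple. It weights the two parts slightly
    differently, so its output depends mildly on the anchor."""
    return sum(1.00 * a + 0.95 * r for a, r in zip(anchor, residual))

def predict_with_uncertainty(x, anchors):
    """Mean and standard deviation of predictions across anchors."""
    preds = [toy_model(*encode(x, c)) for c in anchors]
    return statistics.mean(preds), statistics.stdev(preds)

rng = random.Random(0)
# Anchors drawn from a standard normal prior (an assumption for this sketch).
anchors = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
x = [0.5, -1.0, 2.0]

mean_pred, spread = predict_with_uncertainty(x, anchors)
print(mean_pred, spread)
```

A single model evaluated under many anchors yields a distribution of predictions for the same input, and the spread of that distribution serves as the uncertainty estimate.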

Designing Counterfactual Generators using Deep Model Inversion

Oct 05, 2021
Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Deepta Rajan, Jason Liang, Akshay Chaudhari, Andreas Spanias

Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistent with the data manifold). In this paper, we focus on the case where we have access only to the trained deep classifier and not the actual training data. While the problem of inverting deep models to synthesize images from the training distribution has been explored, our goal is to develop a deep inversion approach to generate counterfactual explanations for a given query image. Despite their effectiveness in conditional image synthesis, we show that existing deep inversion methods are insufficient for producing meaningful counterfactuals. We propose DISC (Deep Inversion for Synthesizing Counterfactuals), which improves upon deep inversion by (a) utilizing stronger image priors, (b) incorporating a novel manifold consistency objective, and (c) adopting a progressive optimization strategy. We find that, in addition to producing visually meaningful explanations, the counterfactuals from DISC are effective at learning classifier decision boundaries and are robust to unknown test-time corruptions.

* NeurIPS 2021

Transfer learning suppresses simulation bias in predictive models built from sparse, multi-modal data

Apr 19, 2021
Bogdan Kustowski, Jim A. Gaffney, Brian K. Spears, Gemma J. Anderson, Rushil Anirudh, Peer-Timo Bremer, Jayaraman J. Thiagarajan

Many problems in science, engineering, and business require making predictions based on very few observations. To build a robust predictive model, these sparse data may need to be augmented with simulated data, especially when the design space is multidimensional. Simulations, however, often suffer from an inherent bias. Estimation of this bias may be poorly constrained not only because of data sparsity, but also because traditional predictive models fit only one type of observation, such as scalars or images, instead of all available data modalities, which might have been acquired and simulated at great cost. We combine recent developments in deep learning for building more robust predictive models from multimodal data with a recent, novel technique to suppress the bias, and extend that technique to take into account multiple data modalities. First, an initial, simulation-trained neural network surrogate model learns important correlations between different data modalities and between simulation inputs and outputs. Then, the model is partially retrained, or transfer learned, to fit the observations. Using fewer than 10 inertial confinement fusion experiments for retraining, we demonstrate that this technique systematically improves simulation predictions, while a simple output calibration makes predictions worse. We also offer extensive cross-validation with real and synthetic data to support our findings. The transfer learning method can be applied to other problems that require transferring knowledge from simulations to the domain of real observations. This paper opens up the path to model calibration using multiple data types, which have traditionally been ignored in predictive models.

* 13 pages, 12 figures

On the Design of Deep Priors for Unsupervised Audio Restoration

Apr 14, 2021
Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias

Unsupervised deep learning methods for solving audio restoration problems rely extensively on carefully tailored neural architectures that carry strong inductive biases for defining priors in the time or spectral domain. In this context, a lot of recent success has been achieved with sophisticated convolutional network constructions that recover audio signals in the spectral domain. However, in practice, audio priors require careful engineering of the convolutional kernels to be effective at solving ill-posed restoration tasks, while also being easy to train. To this end, in this paper, we propose a new U-Net based prior that does not impact either the network complexity or convergence behavior of existing convolutional architectures, yet leads to significantly improved restoration. In particular, we advocate the use of carefully designed dilation schedules and dense connections in the U-Net architecture to obtain powerful audio priors. Using empirical studies on standard benchmarks and a variety of ill-posed restoration tasks, such as audio denoising, in-painting, and source separation, we demonstrate that our proposed approach consistently outperforms widely adopted audio prior architectures.
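
One reason a dilation schedule matters is its effect on the receptive field at a fixed parameter count. The sketch below applies the standard receptive-field formula for stacked stride-1 convolutions, 1 + sum((k - 1) * d_i); the two schedules are illustrative and are not the ones used in the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Four layers with kernel size 3: identical parameter count, different schedules.
no_dilation = receptive_field(3, [1, 1, 1, 1])
exponential = receptive_field(3, [1, 2, 4, 8])
print(no_dilation, exponential)  # 9 31
```

The dilated stack covers more than three times as many input samples with the same number of weights, which is consistent with the abstract's claim that the prior can improve without added network complexity.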
