Andrew Geng

Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders

Mar 29, 2024

Andrew Geng, Pin-Yu Chen

Abstract: When evaluating the performance of a pre-trained model transferred to a downstream task, it is imperative to assess not only the in-distribution (ID) accuracy of the downstream model but also its capacity to generalize to and identify out-of-distribution (OOD) samples. In this paper, we unveil the hidden costs associated with intrusive fine-tuning techniques. Specifically, we demonstrate that commonly used fine-tuning methods distort not only the representations necessary for generalizing to covariate-shifted OOD samples (OOD generalization) but also the representations necessary for detecting semantically-shifted OOD samples (OOD detection). To address these challenges, we introduce a new model reprogramming approach for fine-tuning, which we name Reprogrammer. Reprogrammer aims to improve the holistic performance of the downstream model across ID, OOD generalization, and OOD detection tasks. Our empirical evidence reveals that Reprogrammer is less intrusive and yields superior downstream models. Furthermore, we demonstrate that by appending an additional representation residual connection to Reprogrammer, we can further preserve pre-training representations, resulting in an even safer and more robust downstream model capable of excelling in many ID classification, OOD generalization, and OOD detection settings.

* Accepted at SaTML 2024

On the Importance of Gradients for Detecting Distributional Shifts in the Wild

Oct 09, 2021

Rui Huang, Andrew Geng, Yixuan Li

Abstract: Detecting out-of-distribution (OOD) data has become a critical component in ensuring the safe deployment of machine learning models in the real world. Existing OOD detection approaches primarily rely on the output or feature space for deriving OOD scores, while largely overlooking information from the gradient space. In this paper, we present GradNorm, a simple and effective approach for detecting OOD inputs by utilizing information extracted from the gradient space. GradNorm directly employs the vector norm of gradients, backpropagated from the KL divergence between the softmax output and a uniform probability distribution. Our key idea is that the magnitude of gradients is higher for in-distribution (ID) data than for OOD data, making it informative for OOD detection. GradNorm demonstrates superior performance, reducing the average FPR95 by up to 16.33% compared to the previous best method.

* Accepted at NeurIPS 2021
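
The scoring rule described in the GradNorm abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the direction of the KL divergence, which norm is used, or which parameters the gradient is taken with respect to. The sketch below assumes KL(uniform || softmax), an L1 norm, and gradients with respect to the weights of a final linear layer; the names gradnorm_score, h, W, b, and temperature are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def gradnorm_score(h, W, b, temperature=1.0):
    """L1 norm of d KL(u || softmax(z / T)) / dW for a linear layer z = W h + b.

    Assumptions (not taken from the abstract): KL direction, L1 norm,
    and last-layer weights as the parameters of interest.
    """
    logits = (W @ h + b) / temperature
    p = softmax(logits)
    u = np.full_like(p, 1.0 / p.size)      # uniform target distribution
    # Gradient of KL(u || p) with respect to the pre-temperature logits z.
    grad_logits = (p - u) / temperature
    # Chain rule through z = W h + b gives an outer product for dL/dW.
    grad_W = np.outer(grad_logits, h)
    return np.abs(grad_W).sum()


# Toy comparison: a confident (peaked) prediction vs. a flat one.
W = np.eye(3)
b = np.zeros(3)
score_peaked = gradnorm_score(np.array([5.0, 0.0, 0.0]), W, b)
score_flat = gradnorm_score(np.array([1.0, 1.0, 1.0]), W, b)
print(score_peaked > score_flat)
```

In this toy setting the flat prediction already matches the uniform distribution, so its gradient vanishes, while the peaked prediction yields a larger norm. That matches the abstract's claim that gradient magnitude tends to be higher for ID data than for OOD data.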
