Kangwook Lee

Outlier-Robust Group Inference via Gradient Space Clustering

Oct 13, 2022

Yuchen Zeng, Kristjan Greenewald, Kangwook Lee, Justin Solomon, Mikhail Yurochkin

Abstract: Traditional machine learning models focus on achieving good performance on the overall training distribution, but they often underperform on minority groups. Existing methods can improve worst-group performance, but they have several limitations: (i) they require group annotations, which are often expensive and sometimes infeasible to obtain, and/or (ii) they are sensitive to outliers. Most related works fail to solve these two issues simultaneously because they focus on conflicting perspectives of minority groups and outliers. We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters. We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN. Extensive experiments demonstrate that our method significantly outperforms the state of the art in terms of both group identification and downstream worst-group performance.

* 17 pages, 6 tables, 8 figures
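
The idea of representing each data point by a gradient can be illustrated with a minimal sketch. It assumes a plain logistic-regression model, which is a simplification chosen here for illustration and not necessarily the model used in the paper; the function name `per_sample_gradients` and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_sample_gradients(X, y, w, b):
    """Gradient of the logistic loss w.r.t. (w, b), computed separately per sample."""
    grads = []
    for x_i, y_i in zip(X, y):
        logit = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        residual = sigmoid(logit) - y_i  # d(loss)/d(logit) for cross-entropy
        # One gradient entry per weight, plus one for the bias.
        grads.append([residual * x_j for x_j in x_i] + [residual])
    return grads

# Toy data (assumed): three ordinary points and one far-away point.
X = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.2], [5.0, 5.0]]
y = [1, 1, 0, 0]
w, b = [1.0, 0.0], 0.0

G = per_sample_gradients(X, y, w, b)
```

Each row of `G` is one point in gradient space. These rows could then be passed to a standard clustering routine such as DBSCAN, which is not implemented in this sketch.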

Equal Improvability: A New Fairness Notion Considering the Long-term Impact

Oct 13, 2022

Ozgur Guldogan, Yuchen Zeng, Jy-yong Sohn, Ramtin Pedarsani, Kangwook Lee

Abstract: Devising a fair classifier that does not discriminate against different groups is an important problem in machine learning. Although researchers have proposed various ways of defining group fairness, most of them focus only on immediate fairness, ignoring the long-term impact of a fair classifier in the dynamic scenario where each individual can improve its features over time. Such dynamic scenarios occur in the real world, e.g., in college admission and credit lending, where each rejected sample makes an effort to change its features in order to be accepted afterwards. In this dynamic setting, long-term fairness should equalize the samples' feature distribution across different groups after the rejected samples make some effort to improve. To promote long-term fairness, we propose a new fairness notion called Equal Improvability (EI), which equalizes the potential acceptance rate of the rejected samples across different groups, assuming that each rejected sample spends a bounded level of effort. We analyze the properties of EI and its connections with existing fairness notions. To find a classifier that satisfies the EI requirement, we propose and study three different approaches that solve EI-regularized optimization problems. Through experiments on both synthetic and real datasets, we demonstrate that the proposed EI-regularized algorithms find classifiers that are fair in terms of EI. Finally, we provide experimental results on dynamic scenarios that highlight the advantages of our EI metric in achieving long-term fairness. Code is available in a GitHub repository at https://github.com/guldoganozgur/ei_fairness.

* 19 pages, 5 figures, 4 tables
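
The quantity that EI equalizes can be sketched for a simple case. The sketch below assumes a linear score `w·x + b` with acceptance at score >= 0 and effort measured as an L2 change of at most `delta` in the features; under these assumptions the best reachable score is `score + delta * ||w||`. The function name `ei_rates` and the toy data are hypothetical, and the paper's exact definition (for example, which features can be improved) may differ.

```python
import math

def ei_rates(X, groups, w, b, delta):
    """Per-group fraction of rejected samples that could be accepted after effort <= delta."""
    w_norm = math.sqrt(sum(w_j * w_j for w_j in w))
    rejected = {}
    improvable = {}
    for x_i, g in zip(X, groups):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        if score >= 0:
            continue  # already accepted, so not part of the EI condition
        rejected[g] = rejected.get(g, 0) + 1
        # Best achievable score inside an L2 ball of radius delta.
        if score + delta * w_norm >= 0:
            improvable[g] = improvable.get(g, 0) + 1
    return {g: improvable.get(g, 0) / n for g, n in rejected.items()}

# Toy one-dimensional data (assumed) with two groups.
X = [[-0.5], [-2.0], [1.0], [-0.2], [-0.8], [-3.0]]
groups = ["a", "a", "a", "b", "b", "b"]

rates = ei_rates(X, groups, w=[1.0], b=0.0, delta=1.0)
ei_gap = max(rates.values()) - min(rates.values())
```

A gap of zero would mean that rejected samples in every group have the same chance of being accepted after bounded effort; a regularizer could penalize this gap during training.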

A Better Way to Decay: Proximal Gradient Training Algorithms for Neural Nets

Oct 06, 2022

Liu Yang, Jifan Zhang, Joseph Shenouda, Dimitris Papailiopoulos, Kangwook Lee, Robert D. Nowak

Abstract: Weight decay is one of the most widely used forms of regularization in deep learning, and has been shown to improve generalization and robustness. The optimization objective driving weight decay is a sum of losses plus a term proportional to the sum of squared weights. This paper argues that stochastic gradient descent (SGD) may be an inefficient algorithm for this objective. For neural networks with ReLU activations, solutions to the weight decay objective are equivalent to those of a different objective in which the regularization term is instead a sum of products of $\ell_2$ (not squared) norms of the input and output weights associated with each ReLU. This alternative (and effectively equivalent) regularization suggests a novel proximal gradient algorithm for network training. Theory and experiments support the new training approach, showing that it can converge much faster to the sparse solutions it shares with standard weight decay training.
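
The link between the squared-norm and product-of-norms regularizers can be seen from a short worked calculation. It is a sketch for a single bias-free ReLU unit with nonzero input weights $w$ and output weights $v$ (notation assumed here, not taken from the paper), and it uses a factor of $\tfrac{1}{2}$ in front of the squared norms.

```latex
% Positive homogeneity of ReLU: rescaling leaves the unit's output unchanged.
\[
  \frac{v}{\alpha}\,\mathrm{ReLU}\bigl((\alpha w)^{\top} x\bigr)
  = v\,\mathrm{ReLU}\bigl(w^{\top} x\bigr)
  \qquad \text{for all } \alpha > 0 .
\]
% Minimizing the weight decay penalty over this rescaling (AM-GM inequality):
\[
  \min_{\alpha > 0}\;
  \tfrac{1}{2}\Bigl(\alpha^{2}\,\lVert w\rVert_2^{2}
  + \alpha^{-2}\,\lVert v\rVert_2^{2}\Bigr)
  = \lVert w\rVert_2\,\lVert v\rVert_2 ,
  \qquad \text{attained at } \alpha^{2} = \frac{\lVert v\rVert_2}{\lVert w\rVert_2} .
\]
```

Because the loss is unaffected by the rescaling, at a minimizer of the weight decay objective each unit contributes the product of norms rather than the sum of squares, up to the constant in front of the regularizer.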

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks

Jun 15, 2022

Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee

Abstract: Fine-tuning pretrained language models (LMs) without making any architectural changes has become the norm for learning various downstream language tasks. However, for non-language downstream tasks, a common practice is to employ task-specific designs for input layers, output layers, and loss functions. For instance, it is possible to fine-tune an LM into an MNIST classifier by replacing the word embedding layer with an image patch embedding layer, the word token output layer with a 10-way output layer, and the word prediction loss with a 10-way classification loss. A natural question arises: can LM fine-tuning solve non-language downstream tasks without changing the model architecture or loss function? To answer this, we propose Language-Interfaced Fine-Tuning (LIFT) and study its efficacy and limitations by conducting an extensive empirical study on a suite of non-language classification and regression tasks. LIFT does not make any changes to the model architecture or loss function, and it relies solely on the natural language interface, enabling "no-code machine learning with LMs." We find that LIFT performs relatively well across a wide range of low-dimensional classification and regression tasks, matching the performance of the best baselines in many cases, especially for the classification tasks. We report experimental results on the fundamental properties of LIFT, including its inductive bias, sample efficiency, ability to extrapolate, robustness to outliers and label noise, and generalization. We also analyze a few properties/techniques specific to LIFT, e.g., context-aware learning via appropriate prompting, quantification of predictive uncertainty, and two-stage fine-tuning. Our code is available at https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning.
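
A natural language interface of this kind can be sketched as a function that turns one tabular row into a prompt and completion pair. The sentence template, the function names `row_to_prompt` and `make_example`, and the feature names below are hypothetical illustrations, not the templates used in the LIFT repository.

```python
def row_to_prompt(features, target_name="y"):
    """Serialize one tabular row into a natural-language prompt (hypothetical template)."""
    parts = [f"{name} is {value}" for name, value in features.items()]
    return "Given that " + ", ".join(parts) + f", what is {target_name}?"

def make_example(features, label, target_name="y"):
    """Pair a prompt with its completion, as a text-to-text fine-tuning example."""
    return {
        "prompt": row_to_prompt(features, target_name),
        "completion": f" {label}",
    }

# Assumed toy row with two features and a class label.
example = make_example(
    {"sepal length": 5.1, "petal width": 0.2}, "setosa", "the species"
)
```

With examples in this form, the language model is fine-tuned on text alone, so no layer or loss function has to be replaced for the non-language task.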

Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment

May 23, 2022

Tuan Dinh, Jy-yong Sohn, Shashank Rajput, Timothy Ossowski, Yifei Ming, Junjie Hu, Dimitris Papailiopoulos, Kangwook Lee

Abstract: Word translation without parallel corpora has become feasible, rivaling the performance of supervised methods. Recent findings have shown that the accuracy and robustness of unsupervised word translation (UWT) can be improved by making use of visual observations, which are universal representations across languages. In this work, we investigate the potential of using not only visual observations but also pretrained language-image models to enable more efficient and robust UWT. Specifically, we develop a novel UWT method dubbed Word Alignment using Language-Image Pretraining (WALIP), which leverages visual observations via the shared embedding space of images and texts provided by CLIP models (Radford et al., 2021). WALIP has a two-step procedure. First, we retrieve word pairs with high similarity confidence, computed using our proposed image-based fingerprints, which define the initial pivot for the word alignment. Second, we apply our robust Procrustes algorithm to estimate the linear mapping between two embedding spaces, which iteratively corrects and refines the estimated alignment. Our extensive experiments show that WALIP improves upon the state-of-the-art performance of bilingual word alignment for a few language pairs across different word embeddings and displays great robustness to the dissimilarity of language pairs or training corpora for two word embeddings.

* 13 pages, 7 figures, 3 tables

Breaking Fair Binary Classification with Optimal Flipping Attacks

Apr 12, 2022

Changhun Jo, Jy-yong Sohn, Kangwook Lee

Abstract: Minimizing risk with fairness constraints is one of the popular approaches to learning a fair classifier. Recent work has shown that this approach yields an unfair classifier if the training set is corrupted. In this work, we study the minimum amount of data corruption required for a successful flipping attack. First, we find lower/upper bounds on this quantity and show that these bounds are tight when the target model is the unique unconstrained risk minimizer. Second, we propose a computationally efficient data poisoning attack algorithm that can compromise the performance of fair learning algorithms.

Rare Gems: Finding Lottery Tickets at Initialization

Feb 24, 2022

Kartik Sreenivasan, Jy-yong Sohn, Liu Yang, Matthew Grinde, Alliot Nagle, Hongyi Wang, Kangwook Lee, Dimitris Papailiopoulos

Abstract: It has been widely observed that large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, typically by following a time-consuming "train, prune, re-train" approach. Frankle & Carbin (2018) conjecture that we can avoid this by training lottery tickets, i.e., special sparse subnetworks found at initialization that can be trained to high accuracy. However, a subsequent line of work presents concrete evidence that current algorithms for finding trainable networks at initialization fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy than simple baselines remains an open problem. In this work, we partially resolve this open problem by discovering rare gems: subnetworks at initialization that attain considerable accuracy, even before training. Refining these rare gems by means of fine-tuning beats current baselines and leads to accuracy competitive with or better than magnitude pruning methods.

Improved Input Reprogramming for GAN Conditioning

Feb 07, 2022

Tuan Dinh, Daewon Seo, Zhixu Du, Liang Shang, Kangwook Lee

Abstract: We study the GAN conditioning problem, whose goal is to convert a pretrained unconditional GAN into a conditional GAN using labeled data. We first identify and analyze three approaches to this problem -- conditional GAN training from scratch, fine-tuning, and input reprogramming. Our analysis reveals that when the amount of labeled data is small, input reprogramming performs the best. Motivated by real-world scenarios with scarce labeled data, we focus on the input reprogramming approach and carefully analyze the existing algorithm. After identifying a few critical issues with the previous input reprogramming approach, we propose a new algorithm called InRep+. InRep+ addresses the existing issues with novel uses of invertible neural networks and Positive-Unlabeled (PU) learning. Via extensive experiments, we show that InRep+ outperforms all existing methods, particularly when label information is scarce, noisy, and/or imbalanced. For instance, for the task of conditioning a CIFAR10 GAN with 1% labeled data, InRep+ achieves an average Intra-FID of 76.24, whereas the second-best method achieves 114.51.

* 24 pages, 7 figures

GenLabel: Mixup Relabeling using Generative Models

Jan 07, 2022

Jy-yong Sohn, Liang Shang, Hongxu Chen, Jaekyun Moon, Dimitris Papailiopoulos, Kangwook Lee

Abstract: Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves prediction performance, it sometimes degrades it. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling algorithm designed for mixup. In particular, GenLabel helps the mixup algorithm correctly label mixup samples by learning the class-conditional data distribution using generative models. Via extensive theoretical and empirical analysis, we show that mixup, when used together with GenLabel, can effectively resolve the aforementioned phenomenon, improving the generalization performance and the adversarial robustness.
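
For context, the standard mixup step that GenLabel modifies can be sketched as follows. This shows only the usual linear interpolation of inputs and one-hot labels, with the mixing weight drawn from a Beta distribution; the relabeling with generative models proposed in the paper is not implemented here, and the function name `mixup_pair` and the toy values are assumptions.

```python
import random

def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Standard mixup: interpolate both the inputs and the one-hot labels with weight lam."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix

# Two toy samples (assumed) from different classes, with one-hot labels.
x_a, y_a = [0.0, 2.0], [1.0, 0.0]
x_b, y_b = [4.0, 0.0], [0.0, 1.0]

# Random mixing weight, as commonly done in mixup: lam ~ Beta(alpha, alpha).
rng = random.Random(0)
alpha = 1.0
lam = rng.betavariate(alpha, alpha)
x_mix, y_mix = mixup_pair(x_a, y_a, x_b, y_b, lam)

# A fixed weight makes the arithmetic easy to follow by hand.
fixed_x, fixed_y = mixup_pair(x_a, y_a, x_b, y_b, 0.25)
```

With `lam = 0.25` the mixed input is `[3.0, 0.5]` and the mixed label is `[0.25, 0.75]`. The label here depends only on `lam`, not on where the mixed point falls in input space, which is the step a relabeling method would replace.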

Augment & Valuate : A Data Enhancement Pipeline for Data-Centric AI

Dec 07, 2021

Youngjune Lee, Oh Joon Kwon, Haeju Lee, Joonyoung Kim, Kangwook Lee, Kee-Eung Kim

Abstract: Data scarcity and noise are important issues in industrial applications of machine learning. However, it is often challenging to devise a scalable and generalized approach to address the fundamental distributional and semantic properties of a dataset with black-box models. For this reason, data-centric approaches are crucial for automating the machine learning operations pipeline. To serve as the basis for this automation, we suggest a domain-agnostic pipeline for refining the quality of data in image classification problems. This pipeline contains data valuation, cleansing, and augmentation. With an appropriate combination of these methods, we achieved 84.711% test accuracy (ranked #6, Honorable Mention in the Most Innovative) in the Data-Centric AI competition using only the provided dataset.

* Data Centric AI Workshop at NeurIPS 2021
