Aleksander Madry

Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection

Jun 24, 2024

Saachi Jain, Kimia Hamidieh, Kristian Georgiev, Andrew Ilyas, Marzyeh Ghassemi, Aleksander Madry

Abstract: Machine learning models can fail on subgroups that are underrepresented during training. While techniques such as dataset balancing can improve performance on underperforming groups, they require access to training group annotations and can end up removing large portions of the dataset. In this paper, we introduce Data Debiasing with Datamodels (D3M), a debiasing approach which isolates and removes specific training examples that drive the model's failures on minority groups. Our approach enables us to efficiently train debiased classifiers while removing only a small number of examples, and does not require training group annotations or additional hyperparameter tuning.
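
To make the idea of "removing the training examples that drive failures on a minority group" concrete, here is a minimal sketch. It is not the D3M implementation: the function name, the per-group validation losses, and the attribution scores are all hypothetical, and the sketch simply assumes that some attribution method has already estimated how much each training example raises the loss on each validation group.

```python
def select_examples_to_remove(scores, group_losses, k):
    """Pick the k training examples estimated to hurt the worst group most.

    scores[i][g]    -- hypothetical attribution: estimated increase in loss on
                       validation group g caused by training example i.
    group_losses[g] -- hypothetical observed validation loss of group g.
    """
    # The worst group is the one with the highest validation loss.
    worst_group = max(range(len(group_losses)), key=lambda g: group_losses[g])
    # Rank training examples by how much they are estimated to raise that loss.
    ranked = sorted(range(len(scores)),
                    key=lambda i: scores[i][worst_group],
                    reverse=True)
    return worst_group, sorted(ranked[:k])


# Made-up numbers: 4 training examples, 2 validation groups.
scores = [
    [0.1, 0.9],
    [-0.2, -0.1],
    [0.0, 0.7],
    [0.3, -0.4],
]
group_losses = [0.2, 0.8]

worst_group, to_remove = select_examples_to_remove(scores, group_losses, k=2)
print(worst_group, to_remove)
```

Note that only the validation set carries group labels in this sketch; the training examples are ranked purely by their estimated effect, which is in the spirit of the abstract's claim that no training group annotations are needed.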

Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content

May 09, 2024

Sarah H. Cen, Andrew Ilyas, Jennifer Allen, Hannah Li, Aleksander Madry

Abstract: Most modern recommendation algorithms are data-driven: they generate personalized recommendations by observing users' past behaviors. A common assumption in recommendation is that how a user interacts with a piece of content (e.g., whether they choose to "like" it) is a reflection of the content, but not of the algorithm that generated it. Although this assumption is convenient, it fails to capture user strategization: that users may attempt to shape their future recommendations by adapting their behavior to the recommendation algorithm. In this work, we test for user strategization by conducting a lab experiment and survey. To capture strategization, we adopt a model in which strategic users select their engagement behavior based not only on the content, but also on how their behavior affects downstream recommendations. Using a custom music player that we built, we study how users respond to different information about their recommendation algorithm as well as to different incentives about how their actions affect downstream outcomes. We find strong evidence of strategization across outcome metrics, including participants' dwell time and use of "likes." For example, participants who are told that the algorithm mainly pays attention to "likes" and "dislikes" use those functions 1.9x more than participants told that the algorithm mainly pays attention to dwell time. A close analysis of participant behavior (e.g., in response to our incentive conditions) rules out experimenter demand as the main driver of these trends. Further, in our post-experiment survey, nearly half of participants self-report strategizing "in the wild," with some stating that they ignore content they actually like to avoid over-recommendation of that content in the future. Together, our findings suggest that user strategization is common and that platforms cannot ignore the effect of their algorithms on user behavior.

Decomposing and Editing Predictions by Modeling Model Computation

Apr 17, 2024

Harshay Shah, Andrew Ilyas, Aleksander Madry

Abstract: How does the internal computation of a machine learning model transform inputs into predictions? In this paper, we introduce a task called component modeling that aims to address this question. The goal of component modeling is to decompose an ML model's prediction in terms of its components -- simple functions (e.g., convolution filters, attention heads) that are the "building blocks" of model computation. We focus on a special case of this task, component attribution, where the goal is to estimate the counterfactual impact of individual components on a given prediction. We then present COAR, a scalable algorithm for estimating component attributions; we demonstrate its effectiveness across models, datasets, and modalities. Finally, we show that component attributions estimated with COAR directly enable model editing across five tasks, namely: fixing model errors, "forgetting" specific classes, boosting subpopulation robustness, localizing backdoor attacks, and improving robustness to typographic attacks. We provide code for COAR at https://github.com/MadryLab/modelcomponents.
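
The notion of a component's "counterfactual impact" can be illustrated with a toy example. The sketch below is not COAR: it uses plain one-at-a-time ablation on a hypothetical additive "model" whose components are three small functions, and all names in it are assumptions made for illustration.

```python
def model_output(x, components, active):
    """Toy 'model': the prediction is the sum of the active components' outputs."""
    return sum(f(x) for f, on in zip(components, active) if on)


def ablation_attributions(x, components):
    """Estimate each component's counterfactual impact by ablating it alone.

    The attribution of component i is the full prediction minus the
    prediction obtained with component i switched off.
    """
    n = len(components)
    full = model_output(x, components, [True] * n)
    attributions = []
    for i in range(n):
        active = [j != i for j in range(n)]
        attributions.append(full - model_output(x, components, active))
    return attributions


# Hypothetical components standing in for filters or attention heads.
components = [lambda x: 2 * x, lambda x: -x, lambda x: 0.5]
attrs = ablation_attributions(3.0, components)
print(attrs)
```

Because this toy model is additive, each attribution equals the ablated component's own output. Real networks are not additive, which is why estimating such effects at scale is a nontrivial problem.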

Ask Your Distribution Shift if Pre-Training is Right for You

Feb 29, 2024

Benjamin Cohen-Wang, Joshua Vendrow, Aleksander Madry

Abstract: Pre-training is a widely used approach to develop models that are robust to distribution shifts. However, in practice, its effectiveness varies: fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others (compared to training from scratch). In this work, we seek to characterize the failure modes that pre-training can and cannot address. In particular, we focus on two possible failure modes of models under distribution shift: poor extrapolation (e.g., they cannot generalize to a different domain) and biases in the training data (e.g., they rely on spurious features). Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases. After providing theoretical motivation and empirical evidence for this finding, we explore two of its implications for developing robust models: (1) pre-training and interventions designed to prevent exploiting biases have complementary robustness benefits, and (2) fine-tuning on a (very) small, non-diverse but de-biased dataset can result in significantly more robust models than fine-tuning on a large and diverse but biased dataset. Code is available at https://github.com/MadryLab/pretraining-distribution-shift-robustness.

DsDm: Model-Aware Dataset Selection with Datamodels

Jan 23, 2024

Logan Engstrom, Axel Feldmann, Aleksander Madry

Abstract: When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that intuitively should improve model behavior. However, in practice the opposite can often happen: we find that selecting according to similarity with "high quality" data sources may not increase (and can even hurt) performance compared to randomly selecting data. To develop better methods for selecting data, we start by framing dataset selection as an optimization problem that we can directly solve for: given target tasks, a learning algorithm, and candidate data, select the subset that maximizes model performance. This framework thus avoids handpicked notions of data quality, and instead models explicitly how the learning process uses train datapoints to predict on the target tasks. Our resulting method greatly improves language model (LM) performance on both pre-specified tasks and previously unseen tasks. Specifically, choosing target tasks representative of standard LM problems and evaluating on diverse held-out benchmarks, our selected datasets provide a 2x compute multiplier over baseline methods.
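
The "dataset selection as an optimization problem" framing can be sketched in a few lines. This is a simplified illustration, not the DsDm method: it assumes a linear surrogate in which the predicted target loss is a bias plus one weight per included training example, and the weights, bias, and function names are hypothetical. Under that assumption, the best size-k subset is simply the k examples with the lowest weights.

```python
def predicted_loss(bias, weights, subset):
    """Linear surrogate: predicted target loss when training on `subset`."""
    return bias + sum(weights[i] for i in subset)


def select_subset(weights, k):
    """Return the k candidate indices that minimize the linear surrogate.

    Each weight is a hypothetical estimate of how much including that
    example changes the target loss (negative means helpful).
    """
    ranked = sorted(range(len(weights)), key=lambda i: weights[i])
    return sorted(ranked[:k])


# Made-up surrogate parameters for 5 candidate examples.
weights = [0.05, -0.30, -0.10, 0.20, -0.25]
bias = 2.0

chosen = select_subset(weights, k=3)
print(chosen, round(predicted_loss(bias, weights, chosen), 2))
```

The point of the sketch is the contrast with quality filtering: examples are ranked by their estimated effect on the target tasks rather than by how "clean" they look.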

User Strategization and Trustworthy Algorithms

Dec 29, 2023

Sarah H. Cen, Andrew Ilyas, Aleksander Madry

Abstract: Many human-facing algorithms -- including those that power recommender systems or hiring decision tools -- are trained on data provided by their users. The developers of these algorithms commonly adopt the assumption that the data generating process is exogenous: that is, how a user reacts to a given prompt (e.g., a recommendation or hiring suggestion) depends on the prompt and not on the algorithm that generated it. For example, the assumption that a person's behavior follows a ground-truth distribution is an exogeneity assumption. In practice, when algorithms interact with humans, this assumption rarely holds because users can be strategic. Recent studies document, for example, TikTok users changing their scrolling behavior after learning that TikTok uses it to curate their feed, and Uber drivers changing how they accept and cancel rides in response to changes in Uber's algorithm. Our work studies the implications of this strategic behavior by modeling the interactions between a user and their data-driven platform as a repeated, two-player game. We first find that user strategization can actually help platforms in the short term. We then show that it corrupts platforms' data and ultimately hurts their ability to make counterfactual decisions. We connect this phenomenon to user trust, and show that designing trustworthy algorithms can go hand in hand with accurate estimation. Finally, we provide a formalization of trustworthiness that inspires potential interventions.

The Journey, Not the Destination: How Data Guides Diffusion Models

Dec 11, 2023

Kristian Georgiev, Joshua Vendrow, Hadi Salman, Sung Min Park, Aleksander Madry

Abstract: Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity. However, attributing these images back to the training data (that is, identifying specific training examples which caused an image to be generated) remains a challenge. In this paper, we propose a framework that: (i) provides a formal notion of data attribution in the context of diffusion models, and (ii) allows us to counterfactually validate such attributions. Then, we provide a method for computing these attributions efficiently. Finally, we apply our method to find (and evaluate) such attributions for denoising diffusion probabilistic models trained on CIFAR-10 and latent diffusion models trained on MS COCO. We provide code at https://github.com/MadryLab/journey-TRAK.

* 29 pages, 17 figures

Rethinking Backdoor Attacks

Jul 19, 2023

Alaa Khaddaj, Guillaume Leclerc, Aleksandar Makelov, Kristian Georgiev, Hadi Salman, Andrew Ilyas, Aleksander Madry

Abstract: In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation. Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them. In this work, we present a different approach to the backdoor attack problem. Specifically, we show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data, and thus impossible to "detect" in a general sense. Then, guided by this observation, we revisit existing defenses against backdoor attacks and characterize the (often latent) assumptions they make and on which they depend. Finally, we explore an alternative perspective on backdoor attacks: one that assumes these attacks correspond to the strongest feature in the training data. Under this assumption (which we make formal) we develop a new primitive for detecting backdoor attacks. Our primitive naturally gives rise to a detection algorithm that comes with theoretical guarantees and is effective in practice.

* ICML 2023

FFCV: Accelerating Training by Removing Data Bottlenecks

Jun 21, 2023

Guillaume Leclerc, Andrew Ilyas, Logan Engstrom, Sung Min Park, Hadi Salman, Aleksander Madry

Abstract: We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly more efficient, ensuring that GPUs can reach full utilization; and (b) offload as much data processing as possible to the CPU asynchronously, freeing GPU cycles for training. Using FFCV, we train ResNet-18 and ResNet-50 on the ImageNet dataset with a competitive tradeoff between accuracy and training time. For example, we are able to train an ImageNet ResNet-50 model to 75% accuracy in only 20 minutes on a single machine. We demonstrate FFCV's performance, ease-of-use, extensibility, and ability to adapt to resource constraints through several case studies. Detailed installation instructions, documentation, and a Slack support channel are available at https://ffcv.io/.

A User-Driven Framework for Regulating and Auditing Social Media

Apr 20, 2023

Sarah H. Cen, Aleksander Madry, Devavrat Shah

Abstract: People form judgments and make decisions based on the information that they observe. A growing portion of that information is not only provided, but carefully curated by social media platforms. Although lawmakers largely agree that platforms should not operate without any oversight, there is little consensus on how to regulate social media. There is consensus, however, that creating a strict, global standard of "acceptable" content is untenable (e.g., in the US, it is incompatible with Section 230 of the Communications Decency Act and the First Amendment). In this work, we propose that algorithmic filtering should be regulated with respect to a flexible, user-driven baseline. We provide a concrete framework for regulating and auditing a social media platform according to such a baseline. In particular, we introduce the notion of a baseline feed: the content that a user would see without filtering (e.g., on Twitter, this could be the chronological timeline). We require that the feeds a platform filters contain informational content "similar" to that of their respective baseline feeds, and we design a principled way to measure similarity. This approach is motivated by related suggestions that regulations should increase user agency. We present an auditing procedure that checks whether a platform honors this requirement. Notably, the audit needs only black-box access to a platform's filtering algorithm, and it does not access or infer private user information. We provide theoretical guarantees on the strength of the audit. We further show that requiring closeness between filtered and baseline feeds does not impose a large performance cost, nor does it create echo chambers.

* 21 pages, 4 figures
