Tom Goldstein

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training

Feb 16, 2021
Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein

Changes in neural architectures have fostered significant breakthroughs in language modeling and computer vision. Unfortunately, novel architectures often require rethinking the choice of hyperparameters (e.g., learning rate, warmup schedule, and momentum coefficients) to maintain stability of the optimizer. This optimizer instability is often the result of poor parameter initialization and can be avoided by architecture-specific initialization schemes. In this paper, we present GradInit, an automated and architecture-agnostic method for initializing neural networks. GradInit is based on a simple heuristic: the variance of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value. This adjustment is done by introducing a scalar multiplier variable in front of each parameter block and then optimizing these variables using a simple numerical scheme. GradInit accelerates convergence and improves the test performance of many convolutional architectures, both with and without skip connections, and even without normalization layers. It also enables training the original Post-LN Transformer for machine translation without learning rate warmup under a wide range of learning rates and momentum coefficients. Code is available at https://github.com/zhuchen03/gradinit.
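
The heuristic described in this abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation (see the linked repository for that): it assumes a hypothetical two-weight scalar model, plain SGD, and a brute-force grid search over the per-block multipliers in place of the paper's numerical scheme. All names (`DATA`, `search_scales`, `GRID`) are illustrative.

```python
from itertools import product

# Hypothetical toy data drawn from y = 2x.
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def loss(w1, w2):
    """Mean squared-error loss of the scalar model f(x) = w2 * w1 * x."""
    return sum(0.5 * (w2 * w1 * x - y) ** 2 for x, y in DATA) / len(DATA)


def grads(w1, w2):
    """Analytic gradients of the loss with respect to w1 and w2."""
    g1 = sum((w2 * w1 * x - y) * w2 * x for x, y in DATA) / len(DATA)
    g2 = sum((w2 * w1 * x - y) * w1 * x for x, y in DATA) / len(DATA)
    return g1, g2


def loss_after_one_sgd_step(w1, w2, lr):
    """Loss reached after a single SGD step taken from (w1, w2)."""
    g1, g2 = grads(w1, w2)
    return loss(w1 - lr * g1, w2 - lr * g2)


def search_scales(w1, w2, lr, grid):
    """Pick one multiplier per parameter block so that the loss after one
    SGD step is smallest. Grid search is an assumption made for brevity."""
    best = None
    for a1, a2 in product(grid, repeat=2):
        value = loss_after_one_sgd_step(a1 * w1, a2 * w2, lr)
        if best is None or value < best[0]:
            best = (value, a1, a2)
    return best


W1_INIT, W2_INIT = 3.0, 3.0  # deliberately oversized initial weights
GRID = [0.1, 0.25, 0.5, 1.0, 2.0]
LR = 0.05

baseline = loss_after_one_sgd_step(W1_INIT, W2_INIT, LR)
best_loss, a1, a2 = search_scales(W1_INIT, W2_INIT, LR, GRID)
print("post-step loss without rescaling:", baseline)
print("post-step loss with multipliers", (a1, a2), ":", best_loss)
```

Because the grid contains 1.0, the selected multipliers can never do worse than the unscaled initialization on this objective; the abstract states only that the real method optimizes the multipliers with a simple numerical scheme, so exhaustive search stands in for it here.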

Technical Challenges for Training Fair Neural Networks

Feb 12, 2021
Valeriia Cherepanova, Vedant Nanda, Micah Goldblum, John P. Dickerson, Tom Goldstein

As machine learning algorithms have been widely deployed across applications, many concerns have been raised over the fairness of their predictions, especially in high-stakes settings (such as facial recognition and medical imaging). To respond to these concerns, the community has proposed and formalized various notions of fairness as well as methods for rectifying unfair behavior. While fairness constraints have been studied extensively for classical models, the effectiveness of methods for imposing fairness on deep neural networks is unclear. In this paper, we observe that these large models overfit to fairness objectives and produce a range of unintended and undesirable consequences. We conduct our experiments on both facial recognition and automated medical diagnosis datasets using state-of-the-art architectures.

LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition

Jan 25, 2021
Valeriia Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John Dickerson, Gavin Taylor, Tom Goldstein

Facial recognition systems are increasingly deployed by private corporations, government agencies, and contractors for consumer services and mass surveillance programs alike. These systems are typically built by scraping social media profiles for user images. Adversarial perturbations have been proposed for bypassing facial recognition systems. However, existing methods fail on full-scale systems and commercial APIs. We develop our own adversarial filter that accounts for the entire image processing pipeline and is demonstrably effective against industrial-grade pipelines that include face detection and large-scale databases. Additionally, we release an easy-to-use web tool that significantly degrades the accuracy of Amazon Rekognition and the Microsoft Azure Face Recognition API, reducing the accuracy of each to below 1%.

* Published as a conference paper at ICLR 2021

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses

Dec 30, 2020
Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein

As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities: training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space. In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop a unified taxonomy of them.

Data Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses

Dec 18, 2020
Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein

As machine learning systems consume more and more data, practitioners are increasingly forced to automate and outsource the curation of training data in order to meet their data demands. This absence of human supervision over the data collection process exposes organizations to security vulnerabilities: malicious agents can insert poisoned examples into the training set to exploit the machine learning systems that are trained on it. Motivated by the emergence of this paradigm, there has been a surge in work on data poisoning, including a variety of threat models as well as attack and defense methods. The goal of this work is to systematically categorize and discuss a wide range of data poisoning and backdoor attacks, approaches to defending against these threats, and an array of open problems in this space. In addition to describing these methods and the relationships among them in detail, we develop a unified taxonomy of them.

Analyzing the Machine Learning Conference Review Process

Nov 26, 2020
David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein

Mainstream machine learning conferences have seen a dramatic increase in the number of participants, along with a growing range of perspectives, in recent years. Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias. In this work, we critically analyze the review process through a comprehensive study of papers submitted to ICLR between 2017 and 2020. We quantify reproducibility/randomness in review scores and acceptance decisions, and examine whether scores correlate with paper impact. Our findings suggest strong institutional bias in accept/reject decisions, even after controlling for paper quality. Furthermore, we find evidence for a gender gap, with female authors receiving lower scores, lower acceptance rates, and fewer citations per paper than their male counterparts. We conclude our work with recommendations for future conference organizers.

* NeurIPS Workshop on Navigating the Broader Impacts of AI Research. Full version at arXiv:2010.05137

Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff

Nov 18, 2020
Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, Arjun Gupta

Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data. In light of this growing threat, a recent survey of industry professionals revealed heightened fear in the private sector regarding data poisoning. Many previous defenses against poisoning either fail in the face of increasingly strong attacks, or they significantly degrade performance. However, we find that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance. We further verify the effectiveness of this simple defense against adaptive poisoning methods, and we compare to baselines including the popular differentially private SGD (DP-SGD) defense. In the context of backdoors, CutMix greatly mitigates the attack while simultaneously increasing validation accuracy by 9%.

* Authors ordered alphabetically
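
For readers unfamiliar with mixup, one of the augmentations named in this abstract, the following is a minimal sketch of its standard formulation: each training example is replaced by a convex combination of two examples and of their one-hot labels, with the mixing coefficient drawn from a Beta distribution. The helper names and the default `alpha` are assumptions made for illustration, not the authors' training pipeline.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x1, y1, x2, y2, lam):
    """Blend two examples (flat feature lists) and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# Hypothetical two-feature, two-class examples.
clean_x, clean_y = [1.0, 0.0], [1.0, 0.0]
other_x, other_y = [0.0, 1.0], [0.0, 1.0]

lam = sample_lambda(alpha=1.0)
mixed_x, mixed_y = mixup_pair(clean_x, clean_y, other_x, other_y, lam)
print("lambda:", lam)
print("mixed input:", mixed_x, "mixed label:", mixed_y)
```

Every training input becomes a blend of two images, so any single poisoned example enters training only in diluted form; that is one intuition for the effect reported above, offered here as a reading aid and not as a claim from the paper.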

An Open Review of OpenReview: A Critical Analysis of the Machine Learning Conference Review Process

Oct 26, 2020
David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein

* 19 pages, 6 Figures

Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks

Oct 24, 2020
Huimin Zeng, Chen Zhu, Tom Goldstein, Furong Huang

Adversarial training has been shown to be an efficient method to defend against adversarial examples, being one of the few defenses that withstand strong attacks. However, traditional defense mechanisms assume a uniform attack over the examples according to the underlying data distribution, which is clearly unrealistic because the attacker could choose to focus on more vulnerable examples. We present a weighted minimax risk optimization that defends against non-uniform attacks, achieving robustness against adversarial examples under perturbed test data distributions. Our modified risk considers importance weights of different adversarial examples and focuses adaptively on harder examples that are wrongly classified or at higher risk of being classified incorrectly. The designed risk allows the training process to learn a strong defense by optimizing the importance weights. The experiments show that our model significantly improves state-of-the-art adversarial accuracy under non-uniform attacks without a significant drop under uniform attacks.
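
The idea of an importance-weighted risk can be sketched as follows. The abstract does not specify how the weights are parameterized or learned, so this example assumes a simple stand-in: weights given by a softmax over per-example adversarial losses, which places more mass on harder examples. The function names, the temperature parameter, and the loss values are all hypothetical.

```python
import math


def softmax_weights(losses, temperature=1.0):
    """Importance weights on the probability simplex; larger loss, larger weight.
    Softmax weighting is an assumption, not the paper's learned scheme."""
    m = max(losses)  # subtract the max for numerical stability
    exps = [math.exp((value - m) / temperature) for value in losses]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_risk(losses, weights):
    """Importance-weighted empirical risk."""
    return sum(w * value for w, value in zip(weights, losses))


# Hypothetical per-example adversarial losses (easy, medium, hard).
adv_losses = [0.1, 0.5, 2.0]

weights = softmax_weights(adv_losses)
uniform_risk = sum(adv_losses) / len(adv_losses)
risk = weighted_risk(adv_losses, weights)
print("weights:", weights)
print("uniform risk:", uniform_risk, "weighted risk:", risk)
```

With these weights the risk is dominated by the hardest example, so minimizing it pushes the model to improve where a non-uniform attacker would concentrate.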
