Dawn Song

University of California, Berkeley

VERINA: Benchmarking Verifiable Code Generation

May 29, 2025

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

May 28, 2025

Learning to Reason without External Rewards

May 26, 2025

A Critical Evaluation of Defenses against Prompt Injection Attacks

May 23, 2025

SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning

May 22, 2025

In-Context Watermarks for Large Language Models

May 22, 2025

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

May 21, 2025

Probing the Vulnerability of Large Language Models to Polysemantic Interventions

May 16, 2025

AgentXploit: End-to-End Redteaming of Black-Box AI Agents

May 09, 2025

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations

Apr 17, 2025