Haifeng Qian

Greener yet Powerful: Taming Large Code Generation Models with Quantization

Mar 09, 2023
Xiaokai Wei, Sujan Gonugondla, Wasi Ahmad, Shiqi Wang, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, Qing Sun, Ben Athiwaratkun, Mingyue Shang, Murali Krishna Ramanathan, Parminder Bhatia, Bing Xiang

ML-powered code generation aims to assist developers in writing code more productively by intelligently generating code blocks based on natural language prompts. Recently, large pretrained deep learning models have substantially pushed the boundary of code generation and achieved impressive performance. Despite their great power, the huge number of model parameters poses a significant obstacle to adopting them in a regular software development environment, where a developer might use a standard laptop or mid-size server to develop code. Such large models incur significant resource usage (in terms of memory, latency, and dollars) as well as carbon footprint. Model compression is a promising approach to address these challenges. Several techniques have been proposed to compress large pretrained models typically used for vision or textual data. Out of the many available compression techniques, we identify quantization as the most applicable to the code generation task, as it does not require significant retraining cost. Because quantization represents model parameters with lower-bit integers (e.g., int8), both model size and runtime latency benefit from the integer representation. We extensively study the impact of quantized models on code generation tasks across different dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments, we find a quantization recipe that can run even a 6B model on a regular laptop without significant accuracy or robustness degradation. We further find that the recipe is readily applicable to the code summarization task as well.

* 10 pages, 7 figures, 10 tables 
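
The abstract above does not prescribe code; as a rough sketch of the kind of post-training int8 quantization it refers to, the snippet below applies PyTorch dynamic quantization to the linear layers of a public code LM. The checkpoint name and the choice of torch.quantization.quantize_dynamic are illustrative assumptions, not the authors' exact recipe.

    # Illustrative sketch only: post-training int8 quantization of a code LM's
    # linear layers with PyTorch dynamic quantization. The checkpoint name and
    # API choice are assumptions, not the paper's exact recipe.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Salesforce/codegen-350M-mono"  # hypothetical stand-in checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    # Replace fp32 nn.Linear weights with int8 weights; activations stay fp32
    # and are quantized on the fly at inference time.
    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    prompt = "def fibonacci(n):"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = quantized.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))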

ReCode: Robustness Evaluation of Code Generation Models

Dec 20, 2022
Shiqi Wang, Zheng Li, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang

Code generation models have achieved impressive performance. However, they tend to be brittle, as slight edits to a prompt can lead to very different generations; these robustness properties, critical for user experience when deployed in real-life applications, are not well understood. Most existing works on robustness in text or code tasks have focused on classification, while robustness in generation tasks is an uncharted area and to date there is no comprehensive benchmark for robustness in code generation. In this paper, we propose ReCode, a comprehensive robustness evaluation benchmark for code generation models. We customize over 30 transformations specifically for code, covering docstrings, function and variable names, code syntax, and code format. They are carefully designed to be natural in real-life coding practice, preserve the original semantic meaning, and thus provide multifaceted assessments of a model's robustness performance. With human annotators, we verified that over 90% of the perturbed prompts do not alter the semantic meaning of the original prompt. In addition, we define robustness metrics for code generation models that consider the worst-case behavior under each type of perturbation, taking advantage of the fact that executing the generated code can serve as objective evaluation. We demonstrate ReCode on SOTA models using HumanEval and MBPP, as well as function completion tasks derived from them. Interesting observations include: CodeGen is more robust than InCoder and GPT-J; models are most sensitive to syntax perturbations; and robustness evaluation is more challenging on MBPP than on HumanEval.

* Code and data available at https://github.com/amazon-science/recode 
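
As a toy rendering of the worst-case evaluation idea (not the ReCode implementation; the perturbations and the generate callable are hypothetical stand-ins), a sample can be counted as robustly solved only if the generated code passes its tests under every perturbed prompt:

    # Toy sketch of worst-case robustness scoring: a sample counts as robustly
    # solved only if the model passes the tests under every perturbed prompt.
    # The perturbations and the `generate`/`tests` inputs are hypothetical stand-ins.
    def perturb_docstring(prompt: str) -> str:
        return prompt.replace("Return", "Please return")   # paraphrase-style edit

    def perturb_format(prompt: str) -> str:
        return prompt.replace("    ", "\t")                 # tabs instead of spaces

    PERTURBATIONS = [lambda p: p, perturb_docstring, perturb_format]

    def passes_tests(code: str, tests: str) -> bool:
        env = {}
        try:
            exec(code + "\n" + tests, env)   # execution serves as the judge
            return True
        except Exception:
            return False

    def worst_case_pass(generate, prompt: str, tests: str) -> bool:
        return all(passes_tests(generate(t(prompt)), tests) for t in PERTURBATIONS)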

Multi-lingual Evaluation of Code Generation Models

Oct 26, 2022
Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

We present MBXP, an execution-based code completion benchmark in 10+ programming languages. This collection of datasets is generated by our conversion framework, which translates prompts and test cases from the original MBPP dataset into the corresponding data in a target language. Based on this benchmark, we are able to evaluate code generation models in a multi-lingual fashion and, in particular, discover the generalization ability of language models on out-of-domain languages, the advantages of large multi-lingual models over mono-lingual ones, the benefits of few-shot prompting, and zero-shot translation abilities. In addition, we use our code generation model to perform large-scale bootstrapping to obtain synthetic canonical solutions in several languages. These solutions can be used for other code-related evaluations, such as insertion-based generation, summarization, or code translation tasks, for which we demonstrate results and which we release as part of our benchmark.

* Code and data release: https://github.com/amazon-research/mbxp-exec-eval 
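
A minimal sketch of execution-based scoring in the spirit of MBXP; the real harness at the link above handles per-language sandboxing, whereas the command, file layout, and timeout below are illustrative assumptions:

    # Illustrative execution-based check for a non-Python target language:
    # write the model's completion plus the translated test cases to a file
    # and run the language's toolchain; a zero exit code counts as a pass.
    # Paths, commands, and timeouts are assumptions, not the MBXP harness.
    import os
    import subprocess
    import tempfile

    def passes_js_tests(completion: str, test_code: str, timeout: int = 10) -> bool:
        with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
            f.write(completion + "\n" + test_code)
            path = f.name
        try:
            result = subprocess.run(["node", path], capture_output=True, timeout=timeout)
            return result.returncode == 0
        except subprocess.TimeoutExpired:
            return False
        finally:
            os.remove(path)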

Logical Credal Networks

Sep 25, 2021
Haifeng Qian, Radu Marinescu, Alexander Gray, Debarun Bhattacharjya, Francisco Barahona, Tian Gao, Ryan Riegel, Pravinda Sahu

This paper introduces Logical Credal Networks, an expressive probabilistic logic that generalizes many prior models that combine logic and probability. Given imprecise information represented by probability bounds and conditional probability bounds of logic formulas, this logic specifies a set of probability distributions over all interpretations. On the one hand, our approach allows propositional and first-order logic formulas with few restrictions, e.g., without requiring acyclicity. On the other hand, it has a Markov condition similar to Bayesian networks and Markov random fields that is critical in real-world applications. Having both these properties makes this logic unique, and we investigate its performance on maximum a posteriori inference tasks, including solving Mastermind games with uncertainty and detecting credit card fraud. The results show that the proposed method outperforms existing approaches, and its advantage lies in aggregating multiple sources of imprecise information.
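
To make the "set of probability distributions over all interpretations" reading concrete, here is a toy credal computation over two propositions, not the paper's algorithm (it omits conditional bounds and the Markov condition, and the interval bounds are made up): linear programming over the four interpretations yields tight bounds on a query probability.

    # Toy credal inference: variables are the probabilities of the four
    # interpretations of propositions A and B. Interval constraints on
    # P(A) and P(A & B) define a set of distributions; we min/max P(B).
    # The numeric bounds below are made up for illustration.
    from scipy.optimize import linprog

    # Interpretation order: (A, B) in [(0,0), (0,1), (1,0), (1,1)]
    p_A  = [0, 0, 1, 1]   # indicator of A
    p_AB = [0, 0, 0, 1]   # indicator of A and B
    p_B  = [0, 1, 0, 1]   # indicator of B (the query)

    A_ub, b_ub = [], []
    for ind, (lo, hi) in [(p_A, (0.6, 0.9)), (p_AB, (0.3, 0.5))]:
        A_ub.append([-x for x in ind])   # P(formula) >= lo
        b_ub.append(-lo)
        A_ub.append(list(ind))           # P(formula) <= hi
        b_ub.append(hi)

    A_eq, b_eq = [[1, 1, 1, 1]], [1.0]   # probabilities sum to 1

    lower = linprog(p_B, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                    bounds=[(0, 1)] * 4)
    upper = linprog([-x for x in p_B], A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                    bounds=[(0, 1)] * 4)
    print("P(B) lies in [%.2f, %.2f]" % (lower.fun, -upper.fun))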

An Ensemble Approach Towards Adversarial Robustness

Jun 10, 2021
Haifeng Qian

It is a known phenomenon that adversarial robustness comes at a cost to natural accuracy. To improve this trade-off, this paper proposes an ensemble approach that divides a complex robust-classification task into simpler subtasks. Specifically, fractal divide derives multiple training sets from the training data, and fractal aggregation combines inference outputs from multiple classifiers that are trained on those sets. The resulting ensemble classifiers have a unique property that ensures robustness for an input if certain don't-care conditions are met. The new techniques are evaluated on MNIST and Fashion-MNIST, with no adversarial training. The MNIST classifier has 99% natural accuracy, 70% measured robustness and 36.9% provable robustness, within L2 distance of 2. The Fashion-MNIST classifier has 90% natural accuracy, 54.5% measured robustness and 28.2% provable robustness, within L2 distance of 1.5. Both results are new state of the art, and we also present new state-of-the-art binary results on challenging label-pairs.
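
The fractal divide and fractal aggregation constructions are defined in the paper; the skeleton below only conveys the generic "train on derived subsets, then combine outputs" structure using plain bagging-style ensembling, which is a stand-in rather than the paper's method.

    # Generic ensembling skeleton: derive several training subsets, train one
    # classifier per subset, and aggregate predicted probabilities at inference.
    # This is ordinary bagging, not the paper's fractal divide/aggregation.
    import numpy as np

    def derive_subsets(X, y, k, rng):
        idx = [rng.choice(len(X), size=len(X), replace=True) for _ in range(k)]
        return [(X[i], y[i]) for i in idx]

    def train_ensemble(make_model, X, y, k=5, seed=0):
        rng = np.random.default_rng(seed)
        return [make_model().fit(Xs, ys) for Xs, ys in derive_subsets(X, y, k, rng)]

    def predict(models, X):
        probs = np.mean([m.predict_proba(X) for m in models], axis=0)
        return probs.argmax(axis=1)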

Oriole: Thwarting Privacy against Trustworthy Deep Learning Models

Feb 23, 2021
Liuqiao Chen, Hu Wang, Benjamin Zi Hao Zhao, Minhui Xue, Haifeng Qian

Deep neural networks have achieved unprecedented success in face recognition, to the point that any individual can crawl others' data from the Internet without their explicit permission and train high-precision face recognition models, creating a serious violation of privacy. Recently, a well-known system named Fawkes (published in USENIX Security 2020) claimed that this privacy threat can be neutralized by uploading cloaked user images instead of the originals. In this paper, we present Oriole, a system that combines the advantages of data poisoning attacks and evasion attacks to thwart the protection offered by Fawkes, by training the attacker's face recognition model with multi-cloaked images generated by Oriole. Consequently, the face recognition accuracy of the attack model is maintained and the weaknesses of Fawkes are revealed. Experimental results show that the proposed Oriole system effectively interferes with the performance of the Fawkes system and achieves promising attack results. Our ablation study highlights multiple principal factors that affect the performance of the Oriole system, including the DSSIM perturbation budget, the ratio of leaked clean user images, and the number of multi-cloaks for each uncloaked image. We also identify and discuss at length the vulnerabilities of Fawkes. We hope that the new methodology presented in this paper will inform the security community of the need to design more robust privacy-preserving deep learning models.

Logical Neural Networks

Jun 23, 2020
Ryan Riegel, Alexander Gray, Francois Luus, Naweed Khan, Ndivhuwo Makondo, Ismail Yunus Akhalwaya, Haifeng Qian, Ronald Fagin, Francisco Barahona, Udit Sharma, Shajith Ikbal, Hima Karanam, Sumit Neelam, Ankita Likhyani, Santosh Srivastava

We propose a novel framework seamlessly providing key properties of both neural nets (learning) and symbolic logic (knowledge and reasoning). Every neuron has a meaning as a component of a formula in a weighted real-valued logic, yielding a highly interpretable, disentangled representation. Inference is omnidirectional rather than focused on predefined target variables, and corresponds to logical reasoning, including classical first-order logic theorem proving as a special case. The model is end-to-end differentiable, and learning minimizes a novel loss function capturing logical contradiction, yielding resilience to inconsistent knowledge. It also enables the open-world assumption by maintaining bounds on truth values, which can have probabilistic semantics, yielding resilience to incomplete knowledge.

* 10 pages (incl. references), 38 pages supplementary, 7 figures, 9 tables, 6 algorithms. In submission to NeurIPS 2020 
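
For intuition on neurons that act as components of formulas in a weighted real-valued logic, here is a minimal sketch of a weighted Lukasiewicz-style conjunction with truth values clamped to [0, 1]; this is a simplified reading, not the released framework, and it omits truth bounds and omnidirectional inference.

    # Minimal sketch of a weighted real-valued conjunction neuron in the
    # Lukasiewicz style: truth(AND) = clamp(beta - sum_i w_i * (1 - x_i)).
    # With beta = 1 and all weights 1 this reduces to the classical
    # Lukasiewicz t-norm; it is an illustration, not the LNN library.
    import torch

    class WeightedAnd(torch.nn.Module):
        def __init__(self, n_inputs: int):
            super().__init__()
            self.weights = torch.nn.Parameter(torch.ones(n_inputs))
            self.beta = torch.nn.Parameter(torch.tensor(1.0))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x holds truth values in [0, 1]; the output is clamped back to [0, 1]
            return torch.clamp(self.beta - ((1.0 - x) * self.weights).sum(-1), 0.0, 1.0)

    neuron = WeightedAnd(2)
    print(neuron(torch.tensor([1.0, 1.0])), neuron(torch.tensor([1.0, 0.0])))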

With Great Dispersion Comes Greater Resilience: Efficient Poisoning Attacks and Defenses for Online Regression Models

Jun 23, 2020
Jialin Wen, Benjamin Zi Hao Zhao, Minhui Xue, Haifeng Qian

With the rise of third parties in the machine learning pipeline, such as the service provider in "Machine Learning as a Service" (MLaaS), external data contributors in online learning, or the retraining of existing models, the need to ensure the security of the resulting machine learning models has become an increasingly important topic. The security community has demonstrated that without transparency of the data and the resulting model, there exist many potential security risks, with new risks constantly being discovered. In this paper, we focus on one of these security risks -- poisoning attacks. Specifically, we analyze how attackers may interfere with the results of regression learning by poisoning the training datasets. To this end, we analyze and develop a new poisoning attack algorithm. Our attack, termed Nopt, in contrast with previous poisoning attack algorithms, can produce larger errors with the same proportion of poisoning data-points. Furthermore, we also significantly improve the state-of-the-art defense algorithm, termed TRIM, proposed by Jagielski et al. (IEEE S&P 2018), by incorporating the concept of probability estimation of unpolluted data-points into the algorithm. Our new defense algorithm, termed Proda, demonstrates increased effectiveness in reducing errors arising from the poisoned dataset. We highlight that the time complexity of TRIM had not been estimated; however, we deduce from their work that TRIM can take exponential time complexity in the worst-case scenario, in excess of Proda's logarithmic time. The performance of both our proposed attack and defense algorithms is extensively evaluated on four real-world datasets of housing prices, loans, health care, and bike sharing services. We hope that our work will inspire future research to develop more robust learning algorithms immune to poisoning attacks.
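
Nopt and Proda are defined in the paper; for background, the TRIM baseline they build on can be sketched as alternately fitting a regression on the currently trusted points and re-selecting the points with the smallest residuals. The sketch below is a simplified rendering of Jagielski et al. (2018) with made-up parameters, not the paper's Proda.

    # Simplified TRIM-style trimmed regression: alternately fit on the currently
    # trusted points and re-select the n_trusted points with smallest residuals,
    # so high-residual (potentially poisoned) points are excluded from training.
    # A simplified rendering of Jagielski et al. (2018), not the paper's Proda.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def trim_fit(X, y, n_trusted, n_iters=20, seed=0):
        rng = np.random.default_rng(seed)
        trusted = rng.choice(len(X), size=n_trusted, replace=False)
        model = LinearRegression()
        for _ in range(n_iters):
            model.fit(X[trusted], y[trusted])
            residuals = (model.predict(X) - y) ** 2
            new_trusted = np.argsort(residuals)[:n_trusted]
            if set(new_trusted) == set(trusted):
                break
            trusted = new_trusted
        return model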

Neural Belief Reasoner

Sep 10, 2019
Haifeng Qian

This paper proposes a new generative model called neural belief reasoner (NBR). It differs from previous models in that it specifies a belief function rather than a probability distribution. Its implementation consists of neural networks, fuzzy-set operations and belief-function operations, and query-answering, sample-generation and training algorithms are presented. This paper studies NBR in two tasks. The first is a synthetic unsupervised-learning task, which demonstrates NBR's ability to perform multi-hop reasoning, reasoning with uncertainty and reasoning about conflicting information. The second is supervised learning: a robust MNIST classifier. Without any adversarial training, this classifier exceeds the state of the art in adversarial robustness as measured by the L2 metric, and at the same time maintains 99% accuracy on natural images. A proof is presented that, as capacity increases, NBR classifiers can asymptotically approach the best possible robustness.

L2-Nonexpansive Neural Networks

Jul 03, 2018
Haifeng Qian, Mark N. Wegman

This paper proposes a class of well-conditioned neural networks in which a unit amount of change in the inputs causes at most a unit amount of change in the outputs or any of the internal layers. We develop the known methodology of controlling Lipschitz constants to realize its full potential in maximizing robustness, with a new regularization scheme for linear layers, new ways to adapt nonlinearities and a new loss function. With MNIST and CIFAR-10 classifiers, we demonstrate a number of advantages. Without needing any adversarial training, the proposed classifiers exceed the state of the art in robustness against white-box L2-bounded adversarial attacks. Their outputs are quantitatively more meaningful than ordinary networks and indicate levels of confidence and generalization. They are also free of exploding gradients, among other desirable properties.
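
The paper's recipe involves a new regularization scheme for linear layers, adapted nonlinearities, and a new loss; the sketch below is only a rough stand-in for the "unit change in, at most unit change out" property, normalizing each weight matrix by a power-iteration estimate of its spectral norm and using 1-Lipschitz activations. The architecture and normalization choice are assumptions, not the paper's construction.

    # Rough stand-in for L2-nonexpansiveness: divide each weight matrix by an
    # estimate of its largest singular value so every linear map has Lipschitz
    # constant <= 1, and use 1-Lipschitz activations (ReLU). This mimics the
    # goal of the paper, not its actual regularizer or loss.
    import torch

    def spectral_norm_estimate(W: torch.Tensor, n_iters: int = 50) -> torch.Tensor:
        v = torch.randn(W.shape[1])
        for _ in range(n_iters):           # power iteration on W^T W
            v = torch.nn.functional.normalize(W.t() @ (W @ v), dim=0)
        return torch.norm(W @ v)            # v is unit-norm, so this approximates sigma_max

    class NonexpansiveLinear(torch.nn.Module):
        def __init__(self, d_in: int, d_out: int):
            super().__init__()
            self.weight = torch.nn.Parameter(torch.randn(d_out, d_in) / d_in ** 0.5)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            sigma = spectral_norm_estimate(self.weight.detach())
            # Only shrink when sigma > 1 so small-norm layers are left unchanged.
            return x @ (self.weight / torch.clamp(sigma, min=1.0)).t()

    # Stacking such layers with ReLU keeps the whole network 1-Lipschitz in L2.
    net = torch.nn.Sequential(NonexpansiveLinear(784, 256), torch.nn.ReLU(),
                              NonexpansiveLinear(256, 10))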
