Martin Vechev

Programmable Synthetic Tabular Data Generation

Jul 10, 2023
Mark Vero, Mislav Balunović, Martin Vechev

Large amounts of tabular data remain underutilized due to privacy, data quality, and data sharing limitations. While training a generative model that produces synthetic data resembling the original distribution addresses some of these issues, most applications also require the generated data to satisfy additional constraints. Existing synthetic data approaches are limited as they typically handle only specific constraints, e.g., differential privacy (DP) or increased fairness, and lack an accessible interface for declaring general specifications. In this work, we introduce ProgSyn, the first programmable synthetic tabular data generation algorithm that allows for comprehensive customization over the generated data. To ensure high data quality while adhering to custom specifications, ProgSyn pre-trains a generative model on the original dataset and fine-tunes it on a differentiable loss automatically derived from the provided specifications. These can be programmatically declared using statistical and logical expressions, supporting a wide range of requirements (e.g., DP or fairness, among others). We conduct an extensive experimental evaluation of ProgSyn on a number of constraints, achieving a new state-of-the-art on some while remaining general. For instance, at the same fairness level we achieve 2.3% higher downstream accuracy than the state-of-the-art in fair synthetic data generation on the Adult dataset. Overall, ProgSyn provides a versatile and accessible framework for generating constrained synthetic tabular data, allowing for specifications that generalize beyond the capabilities of prior work.
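
As one way to picture the fine-tuning step described above, the sketch below (not the ProgSyn implementation) turns a hypothetical declared specification, a target mean for one column, into a differentiable penalty added to a generic generator loss.

```python
import torch

# Minimal sketch (not the ProgSyn implementation): fine-tune a pretrained
# generator so that its samples satisfy a declared statistical constraint.
# The hypothetical specification here is "column 0 should have mean 0.5".

def constraint_penalty(samples, column=0, target_mean=0.5):
    # Differentiable penalty derived from the declared specification.
    return (samples[:, column].mean() - target_mean) ** 2

# Stand-in for a generative model pre-trained on the original dataset.
generator = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8)
)
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(100):
    noise = torch.randn(256, 16)
    samples = generator(noise)
    fidelity = samples.pow(2).mean()           # stand-in for the data-fidelity loss
    loss = fidelity + 10.0 * constraint_penalty(samples)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```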


Understanding Certified Training with Interval Bound Propagation

Jun 17, 2023
Yuhao Mao, Mark Niklas Müller, Marc Fischer, Martin Vechev

As robustness verification methods are becoming more precise, training certifiably robust neural networks is becoming ever more relevant. To this end, certified training methods compute and then optimize an upper bound on the worst-case loss over a robustness specification. Curiously, training methods based on the imprecise interval bound propagation (IBP) consistently outperform those leveraging more precise bounding methods. Still, we lack an understanding of the mechanisms that make IBP so successful. In this work, we thoroughly investigate these mechanisms by leveraging a novel metric measuring the tightness of IBP bounds. We first show theoretically that, for deep linear models, tightness decreases with width and depth at initialization, but improves with IBP training, given sufficient network width. We then derive necessary and sufficient conditions on weight matrices for IBP bounds to become exact and demonstrate that these impose strong regularization, explaining the empirically observed trade-off between robustness and accuracy in certified training. Our extensive experimental evaluation validates our theoretical predictions for ReLU networks, including that wider networks improve performance, yielding state-of-the-art results. Interestingly, we observe that while all IBP-based training methods lead to high tightness, this is neither sufficient nor necessary to achieve high certifiable robustness. This hints at the existence of new training methods that do not induce the strong regularization required for tight IBP bounds, leading to improved robustness and standard accuracy.
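
As a reminder of what IBP computes, the minimal sketch below propagates an input box through a linear layer and a ReLU using standard interval arithmetic; the gap between the resulting bounds is what the tightness metric discussed above quantifies. This is generic IBP, not the paper's code.

```python
import torch

def ibp_linear(lb, ub, weight, bias):
    # Interval propagation through y = x W^T + b: split W into its positive
    # and negative parts so the resulting bounds remain sound.
    w_pos, w_neg = weight.clamp(min=0), weight.clamp(max=0)
    new_lb = lb @ w_pos.T + ub @ w_neg.T + bias
    new_ub = ub @ w_pos.T + lb @ w_neg.T + bias
    return new_lb, new_ub

def ibp_relu(lb, ub):
    # ReLU is monotone, so bounds pass through elementwise.
    return lb.clamp(min=0), ub.clamp(min=0)

x, eps = torch.randn(1, 4), 0.1
lb, ub = x - eps, x + eps
layer = torch.nn.Linear(4, 3)
lb, ub = ibp_relu(*ibp_linear(lb, ub, layer.weight, layer.bias))
tightness_gap = (ub - lb).mean()  # looser bounds mean a larger gap
```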


Hiding in Plain Sight: Disguising Data Stealing Attacks in Federated Learning

Jun 16, 2023
Kostadin Garov, Dimitar I. Dimitrov, Nikola Jovanović, Martin Vechev

Malicious server (MS) attacks have enabled the scaling of data stealing in federated learning to large batch sizes and secure aggregation, settings previously considered private. However, many concerns have been raised regarding the client-side detectability of MS attacks, questioning their practicality once they are publicly known. In this work, for the first time, we thoroughly study the problem of client-side detectability. We demonstrate that most prior MS attacks, which fundamentally rely on one of two key principles, are detectable by principled client-side checks. Further, we formulate desiderata for practical MS attacks and propose SEER, a novel attack framework that satisfies all desiderata, while stealing user data from gradients of realistic networks, even for large batch sizes (up to 512 in our experiments) and under secure aggregation. The key insight of SEER is the use of a secret decoder, which is jointly trained with the shared model. Our work represents a promising first step towards a more principled treatment of MS attacks, paving the way for realistic data stealing that can compromise user privacy in real-world deployments.
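
A purely conceptual sketch of the stated key insight, a secret decoder trained jointly with the shared model to reconstruct an input from the gradient a client would send, is given below; it is not the SEER attack and it ignores batching, disaggregation, and secure aggregation.

```python
import torch

# Conceptual sketch only (not the SEER attack): train a decoder, jointly with
# the shared model, to map a client's gradient back to the client's input.

shared_model = torch.nn.Linear(32, 10)            # stand-in for the shared model
grad_dim = sum(p.numel() for p in shared_model.parameters())
decoder = torch.nn.Sequential(
    torch.nn.Linear(grad_dim, 128), torch.nn.ReLU(), torch.nn.Linear(128, 32)
)
opt = torch.optim.Adam(list(shared_model.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(200):
    x = torch.randn(1, 32)                        # simulated client datapoint
    y = torch.randint(0, 10, (1,))
    loss = torch.nn.functional.cross_entropy(shared_model(x), y)
    grads = torch.autograd.grad(loss, shared_model.parameters(), create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    recon_loss = (decoder(flat_grad) - x).pow(2).mean()   # reconstruction objective
    opt.zero_grad()
    recon_loss.backward()
    opt.step()
```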


Incentivizing Honesty among Competitors in Collaborative Learning and Optimization

May 25, 2023
Florian E. Dorner, Nikola Konstantinov, Georgi Pashaliev, Martin Vechev

Collaborative learning techniques have the potential to enable training machine learning models that are superior to models trained on a single entity's data. However, in many cases, potential participants in such collaborative schemes are competitors on a downstream task, such as firms that each aim to attract customers by providing the best recommendations. This can incentivize dishonest updates that damage other participants' models, potentially undermining the benefits of collaboration. In this work, we formulate a game that models such interactions and study two learning tasks within this framework: single-round mean estimation and multi-round SGD on strongly-convex objectives. For a natural class of player actions, we show that rational clients are incentivized to strongly manipulate their updates, preventing learning. We then propose mechanisms that incentivize honest communication and ensure learning quality comparable to full cooperation. Lastly, we empirically demonstrate the effectiveness of our incentive scheme on a standard non-convex federated learning benchmark. Our work shows that explicitly modeling the incentives and actions of dishonest clients, rather than assuming them malicious, can enable strong robustness guarantees for collaborative learning.
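
The single-round mean estimation setting can be pictured with the toy simulation below; it shows only the kind of manipulation the proposed mechanisms are designed to disincentivize, not the mechanisms themselves.

```python
import numpy as np

# Toy illustration (not the paper's mechanism): in single-round mean
# estimation, one competitor reports a manipulated local mean, dragging the
# aggregate estimate away from the truth and harming the other participants.

rng = np.random.default_rng(0)
true_mean = 1.0
clients = [rng.normal(true_mean, 1.0, size=100) for _ in range(5)]

honest_reports = [c.mean() for c in clients]
honest_estimate = np.mean(honest_reports)

manipulated_reports = honest_reports.copy()
manipulated_reports[0] += 5.0                 # client 0 exaggerates its report
manipulated_estimate = np.mean(manipulated_reports)

print(f"honest aggregate:      {honest_estimate:.3f}")
print(f"manipulated aggregate: {manipulated_estimate:.3f}")
```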

* 37 pages, 5 figures 

Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

May 25, 2023
Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev

Large language models (large LMs) are susceptible to producing text with hallucinated content. Self-contradiction, where the LM generates two contradictory sentences within the same context, is an important form of hallucination. In this work, we present a comprehensive analysis of self-contradiction for state-of-the-art, instruction-tuned LMs, including evaluation, detection, and mitigation. To effectively trigger self-contradictions, we design a framework that constrains LMs to generate appropriate sentence pairs. Our evaluation on these sentence pairs reveals that self-contradictions occur frequently across different LMs for both famous and lesser-known topics. Next, we prompt the LMs to detect self-contradictions. Our results indicate that ChatGPT and GPT-4 are able to accurately identify self-contradictions, while Vicuna-13B struggles to do so. For example, with our best prompting method, ChatGPT achieves 91.0% precision and 80.5% recall on the sentence pairs generated by itself. To automatically mitigate self-contradictions, we develop an iterative algorithm that prompts the LMs to remove the detected self-contradictions from the generated text. Our algorithm successfully revises the text such that self-contradictions are significantly reduced, while maintaining its fluency and informativeness. Importantly, our entire pipeline of triggering, detecting, and mitigating self-contradictions is applicable to black-box LMs and does not require any external grounded knowledge.
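
The detect-and-mitigate steps described above lend themselves to a black-box sketch like the one below; `query_lm` is a placeholder for any chat-style LM API and the prompts are heavily simplified, so this is not the authors' pipeline.

```python
# Black-box sketch of contradiction detection and mitigation via prompting.
# `query_lm` is a placeholder; plug in any chat-style LM API.

def query_lm(prompt):
    raise NotImplementedError("connect this to a black-box LM API")

def detect_contradiction(sentence_a, sentence_b):
    # Ask the LM whether two sentences from the same context contradict.
    answer = query_lm(
        "Do the following two sentences contradict each other? Answer yes or no.\n"
        f"1. {sentence_a}\n2. {sentence_b}"
    )
    return answer.strip().lower().startswith("yes")

def mitigate(text, sentence_a, sentence_b):
    # Ask the LM to revise the text so the detected contradiction is removed
    # while keeping it fluent and informative.
    return query_lm(
        "Revise the following text to remove the contradiction between these "
        f"two sentences, keeping it fluent and informative.\n"
        f"Sentence A: {sentence_a}\nSentence B: {sentence_b}\nText: {text}"
    )
```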


TAPS: Connecting Certified and Adversarial Training

May 08, 2023
Yuhao Mao, Mark Niklas Müller, Marc Fischer, Martin Vechev

Training certifiably robust neural networks remains a notoriously hard problem. On one side, adversarial training optimizes under-approximations of the worst-case loss, which leads to insufficient regularization for certification, while on the other, sound certified training methods optimize loose over-approximations, leading to over-regularization and poor (standard) accuracy. In this work we propose TAPS, an (unsound) certified training method that combines IBP and PGD training to yield precise, although not necessarily sound, worst-case loss approximations, reducing over-regularization and increasing certified and standard accuracies. Empirically, TAPS achieves a new state-of-the-art in many settings, e.g., reaching a certified accuracy of $22\%$ on TinyImageNet for $\ell_\infty$-perturbations with radius $\epsilon=1/255$.
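
A rough sketch of composing interval propagation with a PGD-style search is shown below; it conveys the general idea of propagating a box through an early block and attacking inside that box in a later block, while the exact TAPS construction (in particular the gradient coupling between the two stages) differs and is omitted here.

```python
import torch
import torch.nn.functional as F

# Rough sketch, not the exact TAPS construction: IBP through the first block,
# then PGD inside the resulting box through the second block to estimate the
# worst-case loss. The gradient coupling used by TAPS is omitted.

feature = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU())
head = torch.nn.Linear(8, 3)

def ibp_box(x, eps):
    lin = feature[0]
    lb, ub = x - eps, x + eps
    w_pos, w_neg = lin.weight.clamp(min=0), lin.weight.clamp(max=0)
    lb, ub = lb @ w_pos.T + ub @ w_neg.T + lin.bias, ub @ w_pos.T + lb @ w_neg.T + lin.bias
    return lb.clamp(min=0), ub.clamp(min=0)       # ReLU is monotone

def pgd_in_box(lb, ub, y, steps=10, lr=0.1):
    z = ((lb + ub) / 2).detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(head(z), y)
        (grad,) = torch.autograd.grad(loss, z)
        z = (z + lr * grad.sign()).clamp(min=lb, max=ub).detach().requires_grad_(True)
    return F.cross_entropy(head(z), y)            # estimated worst-case loss

x, y = torch.randn(2, 4), torch.tensor([0, 1])
worst_case_loss = pgd_in_box(*ibp_box(x, eps=0.1), y)
```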


Efficient Certified Training and Robustness Verification of Neural ODEs

Mar 09, 2023
Mustafa Zeqiri, Mark Niklas Müller, Marc Fischer, Martin Vechev

Neural Ordinary Differential Equations (NODEs) are a novel neural architecture, built around initial value problems with learned dynamics which are solved during inference. Thought to be inherently more robust against adversarial perturbations, they were recently shown to be vulnerable to strong adversarial attacks, highlighting the need for formal guarantees. However, despite significant progress in robustness verification for standard feed-forward architectures, the verification of high dimensional NODEs remains an open problem. In this work, we address this challenge and propose GAINS, an analysis framework for NODEs combining three key ideas: (i) a novel class of ODE solvers, based on variable but discrete time steps, (ii) an efficient graph representation of solver trajectories, and (iii) a novel abstraction algorithm operating on this graph representation. Together, these advances enable the efficient analysis and certified training of high-dimensional NODEs, by reducing the runtime from an intractable $O(\exp(d) + \exp(T))$ to $O(d + T^2 \log^2 T)$ in the dimensionality $d$ and integration time $T$. In an extensive evaluation on computer vision (MNIST and FMNIST) and time-series forecasting (PHYSIO-NET) problems, we demonstrate the effectiveness of both our certified training and verification methods.
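
A drastically simplified illustration of certified analysis for NODE-like models is given below: interval propagation through fixed-step explicit Euler integration of a learned linear dynamics function. GAINS itself handles adaptive solvers via a graph over solver trajectories and a dedicated abstraction algorithm, none of which is shown.

```python
import torch

# Simplified illustration, not GAINS: interval bounds through explicit Euler
# steps x_{t+1} = x_t + dt * f(x_t) for a learned linear dynamics f.

dynamics = torch.nn.Linear(4, 4)   # stand-in for the learned dynamics f(x)

def ibp_dynamics(lb, ub):
    w_pos, w_neg = dynamics.weight.clamp(min=0), dynamics.weight.clamp(max=0)
    f_lb = lb @ w_pos.T + ub @ w_neg.T + dynamics.bias
    f_ub = ub @ w_pos.T + lb @ w_neg.T + dynamics.bias
    return f_lb, f_ub

def bound_trajectory(x, eps, dt=0.1, steps=10):
    lb, ub = x - eps, x + eps
    for _ in range(steps):
        f_lb, f_ub = ibp_dynamics(lb, ub)
        lb, ub = lb + dt * f_lb, ub + dt * f_ub   # sound, if loose, interval step
    return lb, ub                                 # box around the final state

lb, ub = bound_trajectory(torch.zeros(1, 4), eps=0.05)
```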

* Accepted at ICLR23 

Controlling Large Language Models to Generate Secure and Vulnerable Code

Feb 10, 2023
Jingxuan He, Martin Vechev

Large language models (LMs) are increasingly pretrained on massive corpora of open-source programs and applied to solve program synthesis tasks. However, a fundamental limitation of LMs is their unawareness of security and vulnerability during pretraining and inference. As a result, LMs produce secure or vulnerable programs with high uncertainty (e.g., around 60%/40% chances for GitHub Copilot according to a recent study). This greatly impairs LMs' usability, especially in security-sensitive scenarios. To address this limitation, this work formulates a new problem called controlled code generation, which allows users to input a boolean property into an LM to control if the LM generates secure or vulnerable code. We propose svGen, an effective and lightweight learning approach for solving controlled code generation. svGen leverages property-specific continuous vectors to steer program generation toward the given property, without altering the weights of the LM. svGen's training optimizes those continuous vectors by carefully applying specialized loss terms on different regions of code. Our extensive evaluation shows that svGen achieves strong control capability across various software vulnerabilities and LMs of different parameter sizes. For example, on 9 dangerous vulnerabilities, a state-of-the-art CodeGen LM with 2.7B parameters generates secure programs with a 57% chance. When we use svGen to control the LM to generate secure (resp., vulnerable) programs, the chance is significantly increased to 82% (resp., decreased to 35%).
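
The steering mechanism described above, learned continuous vectors prepended to the input embeddings of a frozen LM, can be sketched as follows. This is in the spirit of prefix-tuning and is not svGen's code; the model name is only an example, and the property-specific losses applied to code regions are not shown.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of steering a frozen LM with learned continuous vectors (prefix-tuning
# style); not svGen's code. "gpt2" is just an example backbone.

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad_(False)                      # LM weights stay frozen

n_prefix = 8
secure_prefix = torch.nn.Parameter(torch.randn(n_prefix, model.config.hidden_size) * 0.02)

def forward_with_prefix(prompt):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    tok_embeds = model.get_input_embeddings()(ids)
    embeds = torch.cat([secure_prefix.unsqueeze(0), tok_embeds], dim=1)
    return model(inputs_embeds=embeds)

# Training would optimize only `secure_prefix` with property-specific losses
# on selected code regions; generation then conditions on the learned prefix.
out = forward_with_prefix("def read_file(path):")
```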


Human-Guided Fair Classification for Natural Language Processing

Dec 20, 2022
Florian E. Dorner, Momchil Peychev, Nikola Konstantinov, Naman Goel, Elliott Ash, Martin Vechev

Text classifiers have promising applications in high-stakes tasks such as resume screening and content moderation. These classifiers must be fair and avoid discriminatory decisions by being invariant to perturbations of sensitive attributes such as gender or ethnicity. However, there is a gap between human intuition about these perturbations and the formal similarity specifications capturing them. While existing research has started to address this gap, current methods are based on hardcoded word replacements, resulting in specifications with limited expressivity or ones that fail to fully align with human intuition (e.g., in cases of asymmetric counterfactuals). This work proposes novel methods for bridging this gap by discovering expressive and intuitive individual fairness specifications. We show how to leverage unsupervised style transfer and GPT-3's zero-shot capabilities to automatically generate expressive candidate pairs of semantically similar sentences that differ along sensitive attributes. We then validate the generated pairs via an extensive crowdsourcing study, which confirms that many of these pairs align with human intuition about fairness in the context of toxicity classification. Finally, we show how limited amounts of human feedback can be leveraged to learn a similarity specification that can be used to train downstream fairness-aware models.
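
The pair-generation step lends itself to a zero-shot prompting sketch like the one below; `query_lm` is a placeholder for any text-generation API, and the prompt is a simplified stand-in for the ones studied in the paper.

```python
# Sketch of generating candidate counterfactual pairs via zero-shot prompting;
# `query_lm` is a placeholder for any text-generation API.

def query_lm(prompt):
    raise NotImplementedError("connect this to a text-generation API")

def generate_candidate_pair(sentence, attribute="gender"):
    # Ask the model to rewrite the sentence along the sensitive attribute
    # while preserving its meaning; the pair is later validated by humans.
    rewrite = query_lm(
        f"Rewrite the following sentence, changing only the {attribute} of the "
        f"people mentioned while keeping the meaning the same:\n{sentence}"
    )
    return sentence, rewrite.strip()

# Validated pairs would then be used to learn a similarity specification for
# training downstream fairness-aware classifiers.
```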

* 31 pages, 1 figure 