Get our free extension to see links to code for papers anywhere online!Free extension: code links for papers anywhere!Free add-on: See code for papers anywhere!

Haoze Wu, Clark Barrett, Nina Narodytska

The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that often demands high-level abstract reasoning about program properties, which is challenging for verification tools. We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of derivation rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure, which led to practical improvements on a set of synthetic and competition benchmarks.

Via

Sagar Patel, Sangeetha Abdu Jyothi, Nina Narodytska

Lack of explainability is a key factor limiting the practical adoption of high-performant Deep Reinforcement Learning (DRL) controllers. Explainable RL for networking hitherto used salient input features to interpret a controller's behavior. However, these feature-based solutions do not completely explain the controller's decision-making process. Often, operators are interested in understanding the impact of a controller's actions on performance in the future, which feature-based solutions cannot capture. In this paper, we present CrystalBox, a framework that explains a controller's behavior in terms of the future impact on key network performance metrics. CrystalBox employs a novel learning-based approach to generate succinct and expressive explanations. We use reward components of the DRL network controller, which are key performance metrics meaningful to operators, as the basis for explanations. CrystalBox is generalizable and can work across both discrete and continuous control environments without any changes to the controller or the DRL workflow. Using adaptive bitrate streaming and congestion control, we demonstrate CrytalBox's ability to generate high-fidelity future-based explanations. We additionally present three practical use cases of CrystalBox: cross-state explainability, guided reward design, and network observability.

Via

Sagar Patel, Junyang Zhang, Sangeetha Abdu Jyothi, Nina Narodytska

Deep Reinforcement Learning (DRL) based controllers offer high performance in a variety of network environments. However, simulator-based training of DRL controllers using highly skewed datasets of real-world traces often results in poor performance in the wild. In this paper, we put forward a generalizable solution for training high-performance DRL controllers in simulators -- Prioritized Trace Selection (PTS). PTS employs an automated three-stage process. First, we identify critical features that determine trace behavior. Second, we classify the traces into clusters. Finally, we dynamically identify and prioritize the salient clusters during training. PTS does not require any changes to the DRL workflow. It can work across both on-policy and off-policy DRL algorithms. We use Adaptive Bit Rate selection and Congestion Control as representative applications to show that PTS offers better performance in simulation and real-world, across multiple controllers and DRL algorithms. Our novel ABR controller, Gelato, trained with PTS outperforms state-of-the-art controllers on the real-world live-streaming platform, Puffer, reducing stalls by 59% and significantly improving average video quality.

Via

Yacine Izza, Xuanxiang Huang, Alexey Ignatiev, Nina Narodytska, Martin C. Cooper, Joao Marques-Silva

The most widely studied explainable AI (XAI) approaches are unsound. This is the case with well-known model-agnostic explanation approaches, and it is also the case with approaches based on saliency maps. One solution is to consider intrinsic interpretability, which does not exhibit the drawback of unsoundness. Unfortunately, intrinsic interpretability can display unwieldy explanation redundancy. Formal explainability represents the alternative to these non-rigorous approaches, with one example being PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. Recently, it has been observed that the (absolute) rigor of PI-explanations can be traded off for a smaller explanation size, by computing the so-called relevant sets. Given some positive {\delta}, a set S of features is {\delta}-relevant if, when the features in S are fixed, the probability of getting the target class exceeds {\delta}. However, even for very simple classifiers, the complexity of computing relevant sets of features is prohibitive, with the decision problem being NPPP-complete for circuit-based classifiers. In contrast with earlier negative results, this paper investigates practical approaches for computing relevant sets for a number of widely used classifiers that include Decision Trees (DTs), Naive Bayes Classifiers (NBCs), and several families of classifiers obtained from propositional languages. Moreover, the paper shows that, in practice, and for these families of classifiers, relevant sets are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained for the families of classifiers considered.

Via

Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey, Nina Narodytska, Joao Marques-Silva

The rise of AI methods to make predictions and decisions has led to a pressing need for more explainable artificial intelligence (XAI) methods. One common approach for XAI is to produce a post-hoc explanation, explaining why a black box ML model made a certain prediction. Formal approaches to post-hoc explanations provide succinct reasons for why a prediction was made, as well as why not another prediction was made. But these approaches assume that features are independent and uniformly distributed. While this means that "why" explanations are correct, they may be longer than required. It also means the "why not" explanations may be suspect as the counterexamples they rely on may not be meaningful. In this paper, we show how one can apply background knowledge to give more succinct "why" formal explanations, that are presumably easier to interpret by humans, and give more accurate "why not" explanations. Furthermore, we also show how to use existing rule induction techniques to efficiently extract background information from a dataset, and also how to report which background information was used to make an explanation, allowing a human to examine it if they doubt the correctness of the explanation.

Via

Yacine Izza, Alexey Ignatiev, Nina Narodytska, Martin C. Cooper, Joao Marques-Silva

Decision trees (DTs) embody interpretable classifiers. DTs have been advocated for deployment in high-risk applications, but also for explaining other complex classifiers. Nevertheless, recent work has demonstrated that predictions in DTs ought to be explained with rigorous approaches. Although rigorous explanations can be computed in polynomial time for DTs, their size may be beyond the cognitive limits of human decision makers. This paper investigates the computation of {\delta}-relevant sets for DTs. {\delta}-relevant sets denote explanations that are succinct and provably precise. These sets represent generalizations of rigorous explanations, which are precise with probability one, and so they enable trading off explanation size for precision. The paper proposes two logic encodings for computing smallest {\delta}-relevant sets for DTs. The paper further devises a polynomial-time algorithm for computing {\delta}-relevant sets which are not guaranteed to be subset-minimal, but for which the experiments show to be most often subset-minimal in practice. The experimental results also demonstrate the practical efficiency of computing smallest {\delta}-relevant sets.

Via

Haoze Wu, Clark Barrett, Mahmood Sharif, Nina Narodytska, Gagandeep Singh

Recently, Graph Neural Networks (GNNs) have been applied for scheduling jobs over clusters achieving better performance than hand-crafted heuristics. Despite their impressive performance, concerns remain over their trustworthiness when deployed in a real-world environment due to their black-box nature. To address these limitations, we consider formal verification of their expected properties such as strategy proofness and locality in this work. We address several domain-specific challenges such as deeper networks and richer specifications not encountered by existing verifiers for image and NLP classifiers. We develop GNN-Verify, the first general framework for verifying both single-step and multi-step properties of these schedulers based on carefully designed algorithms that combine abstractions, refinements, solvers, and proof transfer. Our experimental results on challenging benchmarks show that our approach can provide precise and scalable formal guarantees on the trustworthiness of state-of-the-art GNN-based scheduler.

Via

Parikshit Gopalan, Nina Narodytska, Omer Reingold, Vatsal Sharan, Udi Wieder

Estimating the Kullback-Leibler (KL) divergence between two distributions given samples from them is well-studied in machine learning and information theory. Motivated by considerations of multi-group fairness, we seek KL divergence estimates that accurately reflect the contributions of sub-populations to the overall divergence. We model the sub-populations coming from a rich (possibly infinite) family $\mathcal{C}$ of overlapping subsets of the domain. We propose the notion of multi-group attribution for $\mathcal{C}$, which requires that the estimated divergence conditioned on every sub-population in $\mathcal{C}$ satisfies some natural accuracy and fairness desiderata, such as ensuring that sub-populations where the model predicts significant divergence do diverge significantly in the two distributions. Our main technical contribution is to show that multi-group attribution can be derived from the recently introduced notion of multi-calibration for importance weights [HKRR18, GRSW21]. We provide experimental evidence to support our theoretical results, and show that multi-group attribution provides better KL divergence estimates when conditioned on sub-populations than other popular algorithms.

Via

Yacine Izza, Alexey Ignatiev, Nina Narodytska, Martin C. Cooper, Joao Marques-Silva

Recent work proposed $\delta$-relevant inputs (or sets) as a probabilistic explanation for the predictions made by a classifier on a given input. $\delta$-relevant sets are significant because they serve to relate (model-agnostic) Anchors with (model-accurate) PI- explanations, among other explanation approaches. Unfortunately, the computation of smallest size $\delta$-relevant sets is complete for ${NP}^{PP}$, rendering their computation largely infeasible in practice. This paper investigates solutions for tackling the practical limitations of $\delta$-relevant sets. First, the paper alternatively considers the computation of subset-minimal sets. Second, the paper studies concrete families of classifiers, including decision trees among others. For these cases, the paper shows that the computation of subset-minimal $\delta$-relevant sets is in NP, and can be solved with a polynomial number of calls to an NP oracle. The experimental evaluation compares the proposed approach with heuristic explainers for the concrete case of the classifiers studied in the paper, and confirms the advantage of the proposed solution over the state of the art.

Via

Joao Marques-Silva, Thomas Gerspacher, Martin Cooper, Alexey Ignatiev, Nina Narodytska

In many classification tasks there is a requirement of monotonicity. Concretely, if all else remains constant, increasing (resp. decreasing) the value of one or more features must not decrease (resp. increase) the value of the prediction. Despite comprehensive efforts on learning monotonic classifiers, dedicated approaches for explaining monotonic classifiers are scarce and classifier-specific. This paper describes novel algorithms for the computation of one formal explanation of a (black-box) monotonic classifier. These novel algorithms are polynomial in the run time complexity of the classifier and the number of features. Furthermore, the paper presents a practically efficient model-agnostic algorithm for enumerating formal explanations.

Via