Mateja Jamnik

Distributed representations of graphs for drug pair scoring

Sep 19, 2022

Paul Scherer, Pietro Liò, Mateja Jamnik

Abstract: In this paper we study the practicality and usefulness of incorporating distributed representations of graphs into models within the context of drug pair scoring. We argue that the real-world growth and update cycles of drug pair scoring datasets subvert the limitations of transductive learning associated with distributed representations. Furthermore, we argue that the vocabulary of discrete substructure patterns induced over drug sets is not dramatically large due to the limited set of atom types and constraints on bonding patterns enforced by chemistry. Under this pretext, we explore the effectiveness of distributed representations of the molecular graphs of drugs in drug pair scoring tasks such as drug synergy, polypharmacy, and drug-drug interaction prediction. To achieve this, we present a methodology for learning and incorporating distributed representations of graphs within a unified framework for drug pair scoring. Subsequently, we augment a number of recent and state-of-the-art models to utilise our embeddings. We empirically show that the incorporation of these embeddings improves downstream performance of almost every model across different drug pair scoring tasks, even those the original model was not designed for. We publicly release all of our drug embeddings for the DrugCombDB, DrugComb, DrugbankDDI, and TwoSides datasets.

* 9 main pages, 6 pages reference and appendix

Concept Embedding Models

Sep 19, 2022

Mateo Espinosa Zarlenga, Pietro Barbiero, Gabriele Ciravegna, Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Zohreh Shams, Frederic Precioso, Stefano Melacci, Adrian Weller (+2 more)

Abstract: Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts, particularly in real-world conditions where complete and accurate concept supervisions are scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t. standard neural models without concepts, (2) provide concept representations capturing meaningful semantics including and beyond their ground truth labels, (3) support test-time concept interventions whose effect on test accuracy surpasses that in standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervisions are scarce.

* To appear at NeurIPS 2022

Encoding Concepts in Graph Neural Networks

Aug 07, 2022

Lucie Charlotte Magister, Pietro Barbiero, Dmitry Kazhdan, Federico Siciliano, Gabriele Ciravegna, Fabrizio Silvestri, Mateja Jamnik, Pietro Liò

Abstract: The opaque reasoning of Graph Neural Networks induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations; however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Encoder Module, the first differentiable concept-discovery approach for graph networks. The proposed approach makes graph networks explainable by design by first discovering graph concepts and then using these to solve the task. Our results demonstrate that this approach allows graph networks to: (i) attain model accuracy comparable with their equivalent vanilla versions, (ii) discover meaningful concepts that achieve high concept completeness and purity scores, (iii) provide high-quality concept-based logic explanations for their predictions, and (iv) support effective interventions at test time: these can increase human trust as well as significantly improve model performance.

Representational Systems Theory: A Unified Approach to Encoding, Analysing and Transforming Representations

Jun 07, 2022

Daniel Raggi, Gem Stapleton, Mateja Jamnik, Aaron Stockdill, Grecia Garcia Garcia, Peter C-H. Cheng

Abstract: The study of representations is of fundamental importance to any form of communication, and our ability to exploit them effectively is paramount. This article presents a novel theory, Representational Systems Theory, that is designed to abstractly encode a wide variety of representations from three core perspectives: syntax, entailment, and their properties. By introducing the concept of a construction space, we are able to encode each of these core components under a single, unifying paradigm. Using our Representational Systems Theory, it becomes possible to structurally transform representations in one system into representations in another. An intrinsic facet of our structural transformation technique is representation selection based on properties that representations possess, such as their relative cognitive effectiveness or structural complexity. A major theoretical barrier to providing general structural transformation techniques is a lack of terminating algorithms. Representational Systems Theory permits the derivation of partial transformations when no terminating algorithm can produce a full transformation. Since Representational Systems Theory provides a universal approach to encoding representational systems, a further key barrier is eliminated: the need to devise system-specific structural transformation algorithms, which are necessary when different systems adopt different formalisation approaches. Consequently, Representational Systems Theory is the first general framework that provides a unified approach to encoding representations, supports representation selection via structural transformations, and has the potential for widespread practical application.

* 118 pages total: 94 of main paper + 2 of references + 22 of appendices. Submitted to JACM. Authors Gem Stapleton and Daniel Raggi contributed equally to this research

Autoformalization with Large Language Models

May 25, 2022

Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

Abstract: Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion (25.3%) of mathematical competition problems perfectly to formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover via training on these autoformalized theorems. Our methodology results in a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from 29.6% to 35.2%.

* 44 pages

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

May 22, 2022

Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

Abstract: In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. In Thor, a class of methods called hammers that leverage the power of automated theorem provers are used for premise selection, while all other tasks are designated to language models. Thor increases a language model's success rate on the PISA dataset from 39% to 57%, while solving 8.2% of problems neither language models nor automated theorem provers are able to solve on their own. Furthermore, with a significantly smaller computational budget, Thor can achieve a success rate on the MiniF2F dataset that is on par with the best existing methods. Thor can be instantiated for the majority of popular interactive theorem provers via a straightforward protocol we provide.

Efficient Decompositional Rule Extraction for Deep Neural Networks

Nov 24, 2021

Mateo Espinosa Zarlenga, Zohreh Shams, Mateja Jamnik

Abstract: In recent years, there has been significant work on increasing both interpretability and debuggability of a Deep Neural Network (DNN) by extracting a rule-based model that approximates its decision boundary. Nevertheless, current DNN rule extraction methods that consider a DNN's latent space when extracting rules, known as decompositional algorithms, are either restricted to single-layer DNNs or intractable as the size of the DNN or data grows. In this paper, we address these limitations by introducing ECLAIRE, a novel polynomial-time rule extraction algorithm capable of scaling to both large DNN architectures and large training datasets. We evaluate ECLAIRE on a wide variety of tasks, ranging from breast cancer prognosis to particle detection, and show that it consistently extracts more accurate and comprehensible rule sets than the current state-of-the-art methods while using orders of magnitude fewer computational resources. We make all of our methods available, including a rule set visualisation interface, through the open-source REMIX library (https://github.com/mateoespinosa/remix).

* Accepted at NeurIPS 2021 Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging)

Do Concept Bottleneck Models Learn as Intended?

May 10, 2021

Andrei Margeloiu, Matthew Ashman, Umang Bhatt, Yanzhi Chen, Mateja Jamnik, Adrian Weller

Abstract: Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets. Such models aim to incorporate pre-specified, high-level concepts into the learning procedure, and have been motivated to meet three desiderata: interpretability, predictability, and intervenability. However, we find that concept bottleneck models struggle to meet these goals. Using post hoc interpretability methods, we demonstrate that concepts do not correspond to anything semantically meaningful in input space, thus calling into question the usefulness of concept bottleneck models in their current form.

* Accepted at ICLR 2021 Workshop on Responsible AI
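
The two-stage structure named in the abstract above (inputs to concepts, then concepts to targets) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy threshold concepts, and the AND-style target rule are all hypothetical stand-ins for learned networks, chosen only to show where a test-time concept intervention fits.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def concept_bottleneck_predict(
    x: Sequence[float],
    concept_fn: Callable[[Sequence[float]], List[float]],
    target_fn: Callable[[List[float]], int],
    interventions: Optional[Dict[int, float]] = None,
) -> Tuple[List[float], int]:
    """Predict a target strictly through an intermediate concept vector.

    `interventions` maps a concept index to a corrected value, overriding
    the predicted concept before the target stage runs.
    """
    concepts = list(concept_fn(x))
    for index, value in (interventions or {}).items():
        concepts[index] = value
    return concepts, target_fn(concepts)


# Hypothetical stand-ins for trained models (assumptions, for illustration).
def toy_concepts(x: Sequence[float]) -> List[float]:
    # Each concept fires when the matching input feature exceeds 0.5.
    return [1.0 if feature > 0.5 else 0.0 for feature in x]


def toy_target(concepts: List[float]) -> int:
    # The target is positive only when every concept is active.
    return int(all(c == 1.0 for c in concepts))


if __name__ == "__main__":
    print(concept_bottleneck_predict([0.9, 0.2], toy_concepts, toy_target))
    # Correct the second concept by hand and re-run the target stage.
    print(concept_bottleneck_predict([0.9, 0.2], toy_concepts, toy_target, {1: 1.0}))
```

Because the target stage sees only the concept vector, overriding one concept changes the prediction without touching the input; this is the "intervenability" property the abstract refers to.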

Failing Conceptually: Concept-Based Explanations of Dataset Shift

May 01, 2021

Maleakhi A. Wijaya, Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik

Abstract: Despite their remarkable performance on a wide range of visual tasks, machine learning technologies often succumb to data distribution shifts. Consequently, a range of recent work explores techniques for detecting these shifts. Unfortunately, current techniques offer no explanations about what triggers the detection of shifts, thus limiting their utility to provide actionable insights. In this work, we present Concept Bottleneck Shift Detection (CBSD): a novel explainable shift detection method. CBSD provides explanations by identifying and ranking the degree to which high-level human-understandable concepts are affected by shifts. Using two case studies (dSprites and 3dshapes), we demonstrate how CBSD can accurately detect underlying concepts that are affected by shifts and achieve higher detection accuracy compared to state-of-the-art shift detection methods.

* ICLR 2021 Workshop (RobustML), 16 pages, 14 figures; typos corrected

Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches

Apr 14, 2021

Dmitry Kazhdan, Botty Dimanov, Helena Andres Terre, Mateja Jamnik, Pietro Liò, Adrian Weller

Abstract: Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models. At the same time, the disentanglement learning literature has focused on extracting similar representations in an unsupervised or weakly-supervised way, using deep generative models. Despite the overlapping goals and potential synergies, to our knowledge, there has not yet been a systematic comparison of the limitations and trade-offs between concept-based explanations and disentanglement approaches. In this paper, we give an overview of these fields, comparing and contrasting their properties and behaviours on a diverse set of tasks, and highlighting their potential strengths and limitations. In particular, we demonstrate that state-of-the-art approaches from both classes can be data inefficient, sensitive to the specific nature of the classification/regression task, or sensitive to the employed concept representation.

* Presented at the RAI, WeaSul, and RobustML workshops at The Ninth International Conference on Learning Representations (ICLR) 2021
