Deep learning methods are highly accurate, yet their opaque decision process prevents them from earning full human trust. Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts. However, state-of-the-art concept-based models rely on high-dimensional concept embedding representations which lack a clear semantic meaning, thus questioning the interpretability of their decision process. To overcome this limitation, we propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings. In DCR, neural networks do not make task predictions directly, but they build syntactic rule structures using concept embeddings. DCR then executes these rules on meaningful concept truth degrees to provide a final interpretable and semantically-consistent prediction in a differentiable manner. Our experiments show that DCR: (i) improves up to +25% w.r.t. state-of-the-art interpretable concept-based models on challenging benchmarks (ii) discovers meaningful logic rules matching known ground truths even in the absence of concept supervision during training, and (iii), facilitates the generation of counterfactual examples providing the learnt rules as guidance.
Although neural networks (especially deep neural networks) have achieved \textit{better-than-human} performance in many fields, their real-world deployment is still questionable due to the lack of awareness about the limitation in their knowledge. To incorporate such awareness in the machine learning model, prediction with reject option (also known as selective classification or classification with abstention) has been proposed in literature. In this paper, we present a systematic review of the prediction with the reject option in the context of various neural networks. To the best of our knowledge, this is the first study focusing on this aspect of neural networks. Moreover, we discuss different novel loss functions related to the reject option and post-training processing (if any) of network output for generating suitable measurements for knowledge awareness of the model. Finally, we address the application of the rejection option in reducing the prediction time for the real-time problems and present a comprehensive summary of the techniques related to the reject option in the context of extensive variety of neural networks. Our code is available on GitHub: \url{https://github.com/MehediHasanTutul/Reject_option}
Message Passing Neural Networks (MPNNs) are instances of Graph Neural Networks that leverage the graph to send messages over the edges. This inductive bias leads to a phenomenon known as over-squashing, where a node feature is insensitive to information contained at distant nodes. Despite recent methods introduced to mitigate this issue, an understanding of the causes for over-squashing and of possible solutions are lacking. In this theoretical work, we prove that: (i) Neural network width can mitigate over-squashing, but at the cost of making the whole network more sensitive; (ii) Conversely, depth cannot help mitigate over-squashing: increasing the number of layers leads to over-squashing being dominated by vanishing gradients; (iii) The graph topology plays the greatest role, since over-squashing occurs between nodes at high commute (access) time. Our analysis provides a unified framework to study different recent methods introduced to cope with over-squashing and serves as a justification for a class of methods that fall under `graph rewiring'.
Genome-wide studies leveraging recent high-throughput sequencing technologies collect high-dimensional data. However, they usually include small cohorts of patients, and the resulting tabular datasets suffer from the "curse of dimensionality". Training neural networks on such datasets is typically unstable, and the models overfit. One problem is that modern weight initialisation strategies make simplistic assumptions unsuitable for small-size datasets. We propose Graph-Conditioned MLP, a novel method to introduce priors on the parameters of an MLP. Instead of randomly initialising the first layer, we condition it directly on the training data. More specifically, we create a graph for each feature in the dataset (e.g., a gene), where each node represents a sample from the same dataset (e.g., a patient). We then use Graph Neural Networks (GNNs) to learn embeddings from these graphs and use the embeddings to initialise the MLP's parameters. Our approach opens the prospect of introducing additional biological knowledge when constructing the graphs. We present early results on 7 classification tasks from gene expression data and show that GC-MLP outperforms an MLP.
Combining different modalities of data from human tissues has been critical in advancing biomedical research and personalised medical care. In this study, we leverage a graph embedding model (i.e VGAE) to perform link prediction on tissue-specific Gene-Gene Interaction (GGI) networks. Through ablation experiments, we prove that the combination of multiple biological modalities (i.e multi-omics) leads to powerful embeddings and better link prediction performances. Our evaluation shows that the integration of gene methylation profiles and RNA-sequencing data significantly improves the link prediction performance. Overall, the combination of RNA-sequencing and gene methylation data leads to a link prediction accuracy of 71% on GGI networks. By harnessing graph representation learning on multi-omics data, our work brings novel insights to the current literature on multi-omics integration in bioinformatics.
Single image super-resolution (SISR) aims to obtain a high-resolution output from one low-resolution image. Currently, deep learning-based SISR approaches have been widely discussed in medical image processing, because of their potential to achieve high-quality, high spatial resolution images without the cost of additional scans. However, most existing methods are designed for scale-specific SR tasks and are unable to generalise over magnification scales. In this paper, we propose an approach for medical image arbitrary-scale super-resolution (MIASSR), in which we couple meta-learning with generative adversarial networks (GANs) to super-resolve medical images at any scale of magnification in (1, 4]. Compared to state-of-the-art SISR algorithms on single-modal magnetic resonance (MR) brain images (OASIS-brains) and multi-modal MR brain images (BraTS), MIASSR achieves comparable fidelity performance and the best perceptual quality with the smallest model size. We also employ transfer learning to enable MIASSR to tackle SR tasks of new medical modalities, such as cardiac MR images (ACDC) and chest computed tomography images (COVID-CT). The source code of our work is also public. Thus, MIASSR has the potential to become a new foundational pre-/post-processing step in clinical image analysis tasks such as reconstruction, image quality enhancement, and segmentation.
Recent attempts at Super-Resolution for medical images used deep learning techniques such as Generative Adversarial Networks (GANs) to achieve perceptually realistic single image Super-Resolution. Yet, they are constrained by their inability to generalise to different scale factors. This involves high storage and energy costs as every integer scale factor involves a separate neural network. A recent paper has proposed a novel meta-learning technique that uses a Weight Prediction Network to enable Super-Resolution on arbitrary scale factors using only a single neural network. In this paper, we propose a new network that combines that technique with SRGAN, a state-of-the-art GAN-based architecture, to achieve arbitrary scale, high fidelity Super-Resolution for medical images. By using this network to perform arbitrary scale magnifications on images from the Multimodal Brain Tumor Segmentation Challenge (BraTS) dataset, we demonstrate that it is able to outperform traditional interpolation methods by up to 20$\%$ on SSIM scores whilst retaining generalisability on brain MRI images. We show that performance across scales is not compromised, and that it is able to achieve competitive results with other state-of-the-art methods such as EDSR whilst being fifty times smaller than them. Combining efficiency, performance, and generalisability, this can hopefully become a new foundation for tackling Super-Resolution on medical images.
In this work, a machine learning approach for identifying the multi-omics metabolic regulatory control circuits inside the pathways is described. Therefore, the identification of bacterial metabolic pathways that are more regulated than others in term of their multi-omics follows from the analysis of these circuits . This is a consequence of the alternation of the omic values of codon usage and protein abundance along with the circuits. In this work, the E.Coli's Glycolysis and its multi-omic circuit features are shown as an example.
Disease-gene prediction (DGP) refers to the computational challenge of predicting associations between genes and diseases. Effective solutions to the DGP problem have the potential to accelerate the therapeutic development pipeline at early stages via efficient prioritization of candidate genes for various diseases. In this work, we introduce the variational graph auto-encoder (VGAE) as a promising unsupervised approach for learning powerful latent embeddings in disease-gene networks that can be used for the DGP problem, the first approach using a generative model involving graph neural networks (GNNs). In addition to introducing the VGAE as a promising approach to the DGP problem, we further propose an extension (constrained-VGAE or C-VGAE) which adapts the learning algorithm for link prediction between two distinct node types in heterogeneous graphs. We evaluate and demonstrate the effectiveness of the VGAE on general link prediction in a disease-gene association network and the C-VGAE on disease-gene prediction in the same network, using popular random walk driven methods as baselines. While the methodology presented demonstrates potential solely based on utilizing the topology of a disease-gene association network, it can be further enhanced and explored through the integration of additional biological networks such as gene/protein interaction networks and additional biological features pertaining to the diseases and genes represented in the disease-gene association network.
Reinforcement Learning (RL) has recently achieved tremendous success due to the partnership with Deep Neural Networks (DNNs). Genetic Algorithms (GAs), often seen as a competing approach to RL, have run out of favour due to their inability to scale up to the DNNs required to solve the most complex environments. Contrary to this dichotomic view, in the physical world, evolution and learning are complementary processes that continuously interact. The recently proposed Evolutionary Reinforcement Learning (ERL) framework has demonstrated the capacity of the two methods to enhance each other. However, ERL has not fully addressed the scalability problem of GAs. In this paper, we argue that this problem is rooted in an unfortunate combination of a simple genetic encoding for DNNs and the use of traditional biologically-inspired variation operators. When applied to these encodings, the standard operators are destructive and cause catastrophic forgetting of the traits the networks acquired. We propose a novel algorithm called Proximal Distilled Evolutionary Reinforcement Learning (PDERL) that is characterised by a hierarchical integration between evolution and learning. The main innovation of PDERL is the use of learning-based variation operators that compensate for the simplicity of the genetic representation. Unlike the traditional operators, the ones we propose meet their functional requirements. We evaluate PDERL in five robot locomotion environments from the OpenAI gym. Our method outperforms ERL, as well as two state of the art RL algorithms, PPO and TD3, in all the environments.