Counterfactual explanation is a form of interpretable machine learning that generates perturbations on a sample to achieve the desired outcome. The generated samples can act as instructions to guide end users on how to observe the desired results by altering samples. Although state-of-the-art counterfactual explanation methods are proposed to use variational autoencoder (VAE) to achieve promising improvements, they suffer from two major limitations: 1) the counterfactuals generation is prohibitively slow, which prevents algorithms from being deployed in interactive environments; 2) the counterfactual explanation algorithms produce unstable results due to the randomness in the sampling procedure of variational autoencoder. In this work, to address the above limitations, we design a robust and efficient counterfactual explanation framework, namely CeFlow, which utilizes normalizing flows for the mixed-type of continuous and categorical features. Numerical experiments demonstrate that our technique compares favorably to state-of-the-art methods. We release our source at https://github.com/tridungduong16/fairCE.git for reproducing the results.
Counterfactual fairness alleviates the discrimination between the model prediction toward an individual in the actual world (observational data) and that in counterfactual world (i.e., what if the individual belongs to other sensitive groups). The existing studies need to pre-define the structural causal model that captures the correlations among variables for counterfactual inference; however, the underlying causal model is usually unknown and difficult to be validated in real-world scenarios. Moreover, the misspecification of the causal model potentially leads to poor performance in model prediction and thus makes unfair decisions. In this research, we propose a novel minimax game-theoretic model for counterfactual fairness that can produce accurate results meanwhile achieve a counterfactually fair decision with the relaxation of strong assumptions of structural causal models. In addition, we also theoretically prove the error bound of the proposed minimax model. Empirical experiments on multiple real-world datasets illustrate our superior performance in both accuracy and fairness. Source code is available at \url{https://github.com/tridungduong16/counterfactual_fairness_game_theoretic}.
Increasing number of COVID-19 research literatures cause new challenges in effective literature screening and COVID-19 domain knowledge aware Information Retrieval. To tackle the challenges, we demonstrate two tasks along withsolutions, COVID-19 literature retrieval, and question answering. COVID-19 literature retrieval task screens matching COVID-19 literature documents for textual user query, and COVID-19 question answering task predicts proper text fragments from text corpus as the answer of specific COVID-19 related questions. Based on transformer neural network, we provided solutions to implement the tasks on CORD-19 dataset, we display some examples to show the effectiveness of our proposed solutions.
The rapid advances in automation technologies, such as artificial intelligence (AI) and robotics, pose an increasing risk of automation for occupations, with a likely significant impact on the labour market. Recent social-economic studies suggest that nearly 50\% of occupations are at high risk of being automated in the next decade. However, the lack of granular data and empirically informed models have limited the accuracy of these studies and made it challenging to predict which jobs will be automated. In this paper, we study the automation risk of occupations by performing a classification task between automated and non-automated occupations. The available information is 910 occupations' task statements, skills and interactions categorised by Standard Occupational Classification (SOC). To fully utilize this information, we propose a graph-based semi-supervised classification method named \textbf{A}utomated \textbf{O}ccupation \textbf{C}lassification based on \textbf{G}raph \textbf{C}onvolutional \textbf{N}etworks (\textbf{AOC-GCN}) to identify the automated risk for occupations. This model integrates a heterogeneous graph to capture occupations' local and global contexts. The results show that our proposed method outperforms the baseline models by considering the information of both internal features of occupations and their external interactions. This study could help policymakers identify potential automated occupations and support individuals' decision-making before entering the job market.
Graph learning models are critical tools for researchers to explore graph-structured data. To train a capable graph learning model, a conventional method uses sufficient training data to train a graph model on a single device. However, it is prohibitive to do so in real-world scenarios due to privacy concerns. Federated learning provides a feasible solution to address such limitations via introducing various privacy-preserving mechanisms, such as differential privacy on graph edges. Nevertheless, differential privacy in federated graph learning secures the classified information maintained in graphs. It degrades the performances of the graph learning models. In this paper, we investigate how to implement differential privacy on graph edges and observe the performances decreasing in the experiments. We also note that the differential privacy on graph edges introduces noises to perturb graph proximity, which is one of the graph augmentations in graph contrastive learning. Inspired by that, we propose to leverage the advantages of graph contrastive learning to alleviate the performance dropping caused by differential privacy. Extensive experiments are conducted with several representative graph models and widely-used datasets, showing that contrastive learning indeed alleviates the models' performance dropping caused by differential privacy.
Counterfactual explanations interpret the recommendation mechanism via exploring how minimal alterations on items or users affect the recommendation decisions. Existing counterfactual explainable approaches face huge search space and their explanations are either action-based (e.g., user click) or aspect-based (i.e., item description). We believe item attribute-based explanations are more intuitive and persuadable for users since they explain by fine-grained item demographic features (e.g., brand). Moreover, counterfactual explanation could enhance recommendations by filtering out negative items. In this work, we propose a novel Counterfactual Explainable Recommendation (CERec) to generate item attribute-based counterfactual explanations meanwhile to boost recommendation performance. Our CERec optimizes an explanation policy upon uniformly searching candidate counterfactuals within a reinforcement learning environment. We reduce the huge search space with an adaptive path sampler by using rich context information of a given knowledge graph. We also deploy the explanation policy to a recommendation model to enhance the recommendation. Extensive explainability and recommendation evaluations demonstrate CERec's ability to provide explanations consistent with user preferences and maintain improved recommendations. We release our code at https://github.com/Chrystalii/CERec.
Graph contrastive learning has emerged as a powerful tool for unsupervised graph representation learning. The key to the success of graph contrastive learning is to acquire high-quality positive and negative samples as contrasting pairs for the purpose of learning underlying structural semantics of the input graph. Recent works usually sample negative samples from the same training batch with the positive samples, or from an external irrelevant graph. However, a significant limitation lies in such strategies, which is the unavoidable problem of sampling false negative samples. In this paper, we propose a novel method to utilize \textbf{C}ounterfactual mechanism to generate artificial hard negative samples for \textbf{G}raph \textbf{C}ontrastive learning, namely \textbf{CGC}, which has a different perspective compared to those sampling-based strategies. We utilize counterfactual mechanism to produce hard negative samples, which ensures that the generated samples are similar to, but have labels that different from the positive sample. The proposed method achieves satisfying results on several datasets compared to some traditional unsupervised graph learning methods and some SOTA graph contrastive learning methods. We also conduct some supplementary experiments to give an extensive illustration of the proposed method, including the performances of CGC with different hard negative samples and evaluations for hard negative samples generated with different similarity measurements.
Modern recommender systems operate in a fully server-based fashion. To cater to millions of users, the frequent model maintaining and the high-speed processing for concurrent user requests are required, which comes at the cost of a huge carbon footprint. Meanwhile, users need to upload their behavior data even including the immediate environmental context to the server, raising the public concern about privacy. On-device recommender systems circumvent these two issues with cost-conscious settings and local inference. However, due to the limited memory and computing resources, on-device recommender systems are confronted with two fundamental challenges: (1) how to reduce the size of regular models to fit edge devices? (2) how to retain the original capacity? Previous research mostly adopts tensor decomposition techniques to compress the regular recommendation model with limited compression ratio so as to avoid drastic performance degradation. In this paper, we explore ultra-compact models for next-item recommendation, by loosing the constraint of dimensionality consistency in tensor decomposition. Meanwhile, to compensate for the capacity loss caused by compression, we develop a self-supervised knowledge distillation framework which enables the compressed model (student) to distill the essential information lying in the raw data, and improves the long-tail item recommendation through an embedding-recombination strategy with the original model (teacher). The extensive experiments on two benchmarks demonstrate that, with 30x model size reduction, the compressed model almost comes with no accuracy loss, and even outperforms its uncompressed counterpart in most cases.
Medication recommendation is a crucial task for intelligent healthcare systems. Previous studies mainly recommend medications with electronic health records(EHRs). However, some details of interactions between doctors and patients may be ignored in EHRs, which are essential for automatic medication recommendation. Therefore, we make the first attempt to recommend medications with the conversations between doctors and patients. In this work, we construct DialMed, the first high-quality dataset for medical dialogue-based medication recommendation task. It contains 11,996 medical dialogues related to 16 common diseases from 3 departments and 70 corresponding common medications. Furthermore, we propose a Dialogue structure and Disease knowledge aware Network(DDN), where a graph attention network is utilized to model the dialogue structure and the knowledge graph is used to introduce external disease knowledge. The extensive experimental results demonstrate that the proposed method is a promising solution to recommend medications with medical dialogues. The dataset and code are available at https://github.com/Hhhhhhhzf/DialMed.
Transformers have achieved state-of-the-art performance in learning graph representations. However, there are still some challenges when applying transformers to real-world scenarios due to the fact that deep transformers are hard to be trained from scratch and the memory consumption is large. To address the two challenges, we propose Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph representations, where vanilla graph transformers are used as the encoder and the decoder. GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes. We adopt asymmetric encoder-decoder design, where the encoder is a deep graph transformer and the decoder is a shallow graph transformer. The masking mechanism and the asymmetric design make GMAE a memory-efficient model compared with conventional transformers. We show that, compared with training from scratch, the graph transformer pre-trained using GMAE can achieve much better performance after fine-tuning. We also show that, when serving as a conventional self-supervised graph representation model and using an SVM model as the downstream graph classifier, GMAE achieves state-of-the-art performance on 5 of the 7 benchmark datasets.