Jian Pei

Revealing Unfair Models by Mining Interpretable Evidence

Jul 12, 2022
Mohit Bajaj, Lingyang Chu, Vittorio Romaniello, Gursimran Singh, Jian Pei, Zirui Zhou, Lanjun Wang, Yong Zhang

The popularity of machine learning has increased the risk of unfair models getting deployed in high-stakes applications, such as the justice system, drug/vaccination design, and medical diagnosis. Although there are effective methods to train fair models from scratch, how to automatically reveal and explain the unfairness of a trained model remains a challenging task. Revealing the unfairness of machine learning models in an interpretable fashion is a critical step towards fair and trustworthy AI. In this paper, we systematically tackle the novel task of revealing unfair models by mining interpretable evidence (RUMIE). The key idea is to find solid evidence in the form of a group of data instances discriminated most by the model. To make the evidence interpretable, we also find a set of human-understandable key attributes and decision rules that characterize the discriminated data instances and distinguish them from the other non-discriminated data. As demonstrated by extensive experiments on many real-world data sets, our method finds highly interpretable and solid evidence to effectively reveal the unfairness of trained models. Moreover, it is much more scalable than all of the baseline methods.

Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation

Jun 21, 2022
Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon, Daxin Jiang

The Differentiable Search Index (DSI) is a new, emerging paradigm for information retrieval. Unlike traditional retrieval architectures where index and retrieval are two different and separate components, DSI uses a single transformer model to perform both indexing and retrieval. In this paper, we identify and tackle an important issue of current DSI models: the data distribution mismatch that occurs between the DSI indexing and retrieval processes. Specifically, we argue that, at indexing, current DSI methods learn to build connections between long document texts and their identifiers, but then at retrieval, short query texts are provided to DSI models to perform the retrieval of the document identifiers. This problem is further exacerbated when using DSI for cross-lingual retrieval, where document text and query text are in different languages. To address this fundamental problem of current DSI models, we propose a simple yet effective indexing framework for DSI called DSI-QG. In DSI-QG, documents are represented by a number of relevant queries generated by a query generation model at indexing time. This allows DSI models to connect a document identifier to a set of query texts when indexing, hence mitigating data distribution mismatches present between the indexing and the retrieval phases. Empirical results on popular mono-lingual and cross-lingual passage retrieval benchmark datasets show that DSI-QG significantly outperforms the original DSI model.

* 6 pages
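
The difference between indexing on document text and indexing on generated queries, as described in the DSI-QG abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_query_generator` is a hypothetical stand-in for a trained query generation model, and the function names, the toy corpus, and the docid format are all invented for this example.

```python
from typing import Callable, Dict, List, Tuple

# One training example for the indexing phase: (input text, target docid).
Pair = Tuple[str, str]


def build_dsi_pairs(corpus: Dict[str, str]) -> List[Pair]:
    """Original DSI-style indexing data: long document text -> docid."""
    return [(text, docid) for docid, text in corpus.items()]


def build_dsi_qg_pairs(
    corpus: Dict[str, str],
    generate_queries: Callable[[str, int], List[str]],
    num_queries: int,
) -> List[Pair]:
    """DSI-QG-style indexing data: each generated query -> docid."""
    pairs: List[Pair] = []
    for docid, text in corpus.items():
        for query in generate_queries(text, num_queries):
            pairs.append((query, docid))
    return pairs


def toy_query_generator(text: str, k: int) -> List[str]:
    """Hypothetical stand-in for a query generation model.

    It returns k short word windows from the document; a real system
    would sample queries from a trained sequence-to-sequence model.
    """
    words = text.split()
    return [" ".join(words[i:i + 3]) for i in range(k)]


# Toy corpus (assumed data, for illustration only).
corpus = {
    "doc-0": "differentiable search index maps queries to document identifiers",
    "doc-1": "federated learning trains models without sharing raw data",
}

dsi_pairs = build_dsi_pairs(corpus)
dsi_qg_pairs = build_dsi_qg_pairs(corpus, toy_query_generator, num_queries=2)
```

With the query-based pairs, the inputs seen at indexing time are short, query-like texts, which is the same kind of input the model receives at retrieval time.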

Communication-Efficient Robust Federated Learning with Noisy Labels

Jun 11, 2022
Junyi Li, Jian Pei, Heng Huang

Federated learning (FL) is a promising privacy-preserving machine learning paradigm over distributed data. In FL, the data is kept locally by each user. This protects user privacy, but also makes it difficult for the server to verify data quality, especially whether the data are correctly labeled. Training with corrupted labels is harmful to the federated learning task; however, little attention has been paid to FL in the case of label noise. In this paper, we focus on this problem and propose a learning-based reweighting approach to mitigate the effect of noisy labels in FL. More precisely, we tune a weight for each training sample such that the learned model has optimal generalization performance over a validation set. More formally, the process can be formulated as a Federated Bilevel Optimization problem. A bilevel optimization problem is a type of optimization problem with two levels of entangled problems. Non-distributed bilevel problems have witnessed notable progress recently with new efficient algorithms. However, solving bilevel optimization problems under the Federated Learning setting is under-investigated. We identify that the high communication cost in hypergradient evaluation is the major bottleneck. So we propose Comm-FedBiO to solve the general Federated Bilevel Optimization problems; more specifically, we propose two communication-efficient subroutines to estimate the hypergradient. Convergence analysis of the proposed algorithms is also provided. Finally, we apply the proposed algorithms to solve the noisy label problem. Our approach has shown superior performance on several real-world datasets compared to various baselines.

* To appear in KDD 2022
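
The sample-reweighting problem described in the abstract above can be written as a generic bilevel program. The notation here is an assumption made for illustration and is not taken from the paper: $M$ clients, client $m$ holding $n_m$ samples $(x_{m,i}, y_{m,i})$ with weights $w_{m,i}$, a per-sample loss $\ell$, and a validation set $D_{\mathrm{val}}$.

```latex
\begin{aligned}
\min_{w}\quad & F(w) = f\bigl(\theta^{*}(w)\bigr),
  \qquad f(\theta) = \frac{1}{|D_{\mathrm{val}}|}
  \sum_{(x,y)\in D_{\mathrm{val}}} \ell(\theta; x, y) \\
\text{s.t.}\quad & \theta^{*}(w) = \arg\min_{\theta}\; g(w,\theta),
  \qquad g(w,\theta) = \sum_{m=1}^{M}\sum_{i=1}^{n_m}
  w_{m,i}\,\ell\bigl(\theta; x_{m,i}, y_{m,i}\bigr)
\end{aligned}

% Hypergradient via the implicit function theorem, assuming g is
% strongly convex in \theta (f has no direct dependence on w):
\nabla F(w) = -\,\nabla^{2}_{w\theta}\, g\bigl(w,\theta^{*}(w)\bigr)
  \Bigl[\nabla^{2}_{\theta\theta}\, g\bigl(w,\theta^{*}(w)\bigr)\Bigr]^{-1}
  \nabla_{\theta} f\bigl(\theta^{*}(w)\bigr)
```

In this standard form, the inverse-Hessian term depends on the training data of every client, which is consistent with the abstract's point that hypergradient evaluation is where the communication cost concentrates.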

Multi-Behavior Sequential Recommendation with Temporal Graph Transformer

Jun 06, 2022
Lianghao Xia, Chao Huang, Yong Xu, Jian Pei

Modeling time-evolving preferences of users with their sequential item interactions has attracted increasing attention in many online applications. Hence, sequential recommender systems have been developed to learn dynamic user interests from historical interactions for suggesting items. However, the interaction pattern encoding functions in most existing sequential recommender systems have focused on a single type of user-item interaction. In many real-life online platforms, user-item interactive behaviors are often multi-typed (e.g., click, add-to-favorite, purchase) with complex cross-type behavior inter-dependencies. Learning informative representations of users and items based on their multi-typed interaction data is of great importance to accurately characterize the time-evolving user preference. In this work, we tackle dynamic user-item relation learning with the awareness of multi-behavior interactive patterns. Towards this end, we propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns, by exploring the evolving correlations across different types of behaviors. The new TGT method enables the sequential recommendation architecture to distill dedicated knowledge for type-specific behavior relational context and the implicit behavior dependencies. Experiments on real-world datasets indicate that our method TGT consistently outperforms various state-of-the-art recommendation methods. Our model implementation code is available at https://github.com/akaxlh/TGT.

* This paper has been published as a research paper at TKDE 2022

Trustworthy Graph Neural Networks: Aspects, Methods and Trends

May 16, 2022
He Zhang, Bang Wu, Xingliang Yuan, Shirui Pan, Hanghang Tong, Jian Pei

Graph neural networks (GNNs) have emerged as a series of competent graph learning methods for diverse real-world scenarios, ranging from daily applications like recommendation systems and question answering to cutting-edge technologies such as drug discovery in life sciences and n-body simulation in astrophysics. However, task performance is not the only requirement for GNNs. Performance-oriented GNNs have exhibited potential adverse effects like vulnerability to adversarial attacks, unexplainable discrimination against disadvantaged groups, or excessive resource consumption in edge computing environments. To avoid these unintentional harms, it is necessary to build competent GNNs characterised by trustworthiness. To this end, we propose a comprehensive roadmap to build trustworthy GNNs from the view of the various computing technologies involved. In this survey, we introduce basic concepts and comprehensively summarise existing efforts for trustworthy GNNs from six aspects, including robustness, explainability, privacy, fairness, accountability, and environmental well-being. Additionally, we highlight the intricate cross-aspect relations between the above six aspects of trustworthy GNNs. Finally, we present a thorough overview of trending directions for facilitating the research and industrialisation of trustworthy GNNs.

* 36 pages, 7 tables, 4 figures

Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding

May 07, 2022
Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Xianglin Zuo, Daxin Jiang

Although spoken language understanding (SLU) has achieved great success in high-resource languages, such as English, it remains challenging in low-resource languages mainly due to the lack of high quality training data. The recent multilingual code-switching approach samples some words in an input utterance and replaces them by expressions in some other languages of the same meaning. The multilingual code-switching approach achieves better alignments of representations across languages in zero-shot cross-lingual SLU. Surprisingly, all existing multilingual code-switching methods disregard the inherent semantic structure in SLU, i.e., most utterances contain one or more slots, and each slot consists of one or more words. In this paper, we propose to exploit the "utterance-slot-word" structure of SLU and systematically model this structure by a multi-level contrastive learning framework at the utterance, slot, and word levels. We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels. Furthermore, we develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer. Our experimental results show that our proposed methods significantly improve the performance compared with the strong baselines on two zero-shot cross-lingual SLU benchmark datasets.

Spatial-Temporal Hypergraph Self-Supervised Learning for Crime Prediction

Apr 18, 2022
Zhonghang Li, Chao Huang, Lianghao Xia, Yong Xu, Jian Pei

Crime has become a major concern in many cities, creating a rising demand for timely prediction of citywide crime occurrence. Accurate crime prediction results are vital for proactive government decision-making to alleviate the increasing concern about public safety. While many efforts have been devoted to proposing various spatial-temporal forecasting techniques to explore dependence across locations and time periods, most of them follow a supervised learning paradigm, which limits their spatial-temporal representation ability on sparse crime data. Inspired by the recent success in self-supervised learning, this work proposes a Spatial-Temporal Hypergraph Self-Supervised Learning framework (ST-HSL) to tackle the label scarcity issue in crime prediction. Specifically, we propose cross-region hypergraph structure learning to encode region-wise crime dependency across the entire urban space. Furthermore, we design a dual-stage self-supervised learning paradigm, to not only jointly capture local- and global-level spatial-temporal crime patterns, but also supplement the sparse crime representation by augmenting region self-discrimination. We perform extensive experiments on two real-life crime datasets. Evaluation results show that our ST-HSL significantly outperforms state-of-the-art baselines. Further analysis provides insights into the superiority of our ST-HSL method in the representation of spatial-temporal crime patterns. The implementation code is available at https://github.com/LZH-YS1998/STHSL.

* This paper has been published as a full paper at ICDE 2022

Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling

Apr 11, 2022
Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Daxin Jiang

Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks (xSL), such as cross-lingual machine reading comprehension (xMRC), by transferring knowledge from a high-resource language to low-resource languages. Despite the great success, we draw an empirical observation that there is a training objective gap between the pre-training and fine-tuning stages: e.g., the masked language modeling objective requires local understanding of the masked token, whereas the span-extraction objective requires global understanding and reasoning over the input passage/paragraph and question, leading to a discrepancy between pre-training and xMRC. In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap in a self-supervised manner. Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between representations of input parallel sequences via unsupervised cross-lingual instance-wise training signals during pre-training. By these means, our methods not only bridge the gap between pre-training and fine-tuning, but also enhance PLMs to better capture the alignment between different languages. Extensive experiments show that our method achieves clearly superior results on multiple xSL benchmarks with limited pre-training data. Our methods also surpass the previous state-of-the-art methods by a large margin in few-shot data settings, where only a few hundred training examples are available.

* 15 pages

Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation

Mar 10, 2022
Saeed Ranjbar Alvar, Lanjun Wang, Jian Pei, Yong Zhang

Image-to-image translation models have been shown to be vulnerable to the Membership Inference Attack (MIA), in which the adversary's goal is to identify whether a sample was used to train the model or not. With the growing number of applications based on image-to-image translation models, it is crucial to protect the privacy of these models against MIAs. We propose adversarial knowledge distillation (AKD) as a defense method against MIAs for image-to-image translation models. The proposed method protects the privacy of the training samples by improving the generalizability of the model. We conduct experiments on image-to-image translation models and show that AKD achieves the state-of-the-art utility-privacy tradeoff by reducing the attack performance by up to 38.9% compared with the regularly trained model, at the cost of a slight drop in the quality of the generated output images. The experimental results also indicate that the models trained by AKD generalize better than the regularly trained models. Furthermore, compared with existing defense methods, the results show that at the same privacy protection level, image translation models trained by AKD generate outputs with higher quality, while at the same quality of outputs, AKD enhances the privacy protection by over 30%.

Fair and efficient contribution valuation for vertical federated learning

Jan 07, 2022
Zhenan Fan, Huang Fang, Zirui Zhou, Jian Pei, Michael P. Friedlander, Yong Zhang

Federated learning is a popular technology for training machine learning models on distributed data sources without sharing data. Vertical federated learning, or feature-based federated learning, applies to cases where different data sources share the same sample ID space but differ in feature space. To ensure the data owners' long-term engagement, it is critical to objectively assess the contribution from each data source and recompense them accordingly. The Shapley value (SV) is a provably fair contribution valuation metric originating from cooperative game theory. However, computing the SV requires extensively retraining the model on each subset of data sources, which causes prohibitively high communication costs in federated learning. We propose a contribution valuation metric called vertical federated Shapley value (VerFedSV) based on SV. We show that VerFedSV not only satisfies many desirable properties for fairness but is also efficient to compute, and can be adapted to both synchronous and asynchronous vertical federated learning algorithms. Both theoretical analysis and extensive experimental results verify the fairness, efficiency, and adaptability of VerFedSV.
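
To make the cost argument concrete, the classical Shapley value can be computed exactly for a handful of data sources. The sketch below implements the textbook SV only, not VerFedSV. The `toy_utility` function and its numbers are hypothetical stand-ins for "train a model on this subset of data sources and measure its performance", which is the expensive step the abstract refers to.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    players: List[str],
    utility: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley value: weighted average marginal contribution
    of each player over all coalitions of the other players."""
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                values[p] += weight * (utility(s | {p}) - utility(s))
    return values


def toy_utility(coalition: FrozenSet[str]) -> float:
    """Hypothetical utility (assumed numbers): each source adds a fixed
    gain, and sources A and B add a small extra gain when combined."""
    base = {"A": 0.10, "B": 0.20, "C": 0.30}
    score = sum(base[p] for p in coalition)
    if {"A", "B"} <= coalition:
        score += 0.06
    return score


sv = shapley_values(["A", "B", "C"], toy_utility)
```

The extra gain shared by A and B is split evenly between them, and the values sum to the utility of the full coalition. The nested loops evaluate the utility on every subset of the other players, so the number of evaluations grows exponentially with the number of data sources; when each evaluation means retraining a federated model, this is the cost the abstract describes as prohibitive.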
