Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is timely, since the economy has dramatically shifted due to a pandemic, and a massive number of new loans will be necessary in the short term. We propose a framework for such decisions, including a globally interpretable machine learning model, an interactive visualization of it, and several types of summaries and explanations for any given decision. The machine learning model is a two-layer additive risk model, which resembles a two-layer neural network, but is decomposable into subscales. In this model, each node in the first (hidden) layer represents a meaningful subscale model, and all of the nonlinearities are transparent. Our online visualization tool allows exploration of this model, showing precisely how it came to its conclusion. We provide three types of explanations that are simpler than, but consistent with, the global model: case-based reasoning explanations that use neighboring past cases, a set of features that were the most important for the model's prediction, and summary-explanations that provide a customized sparse explanation for any particular lending decision made by the model. Our framework earned the FICO recognition award for the Explainable Machine Learning Challenge, which was the first public challenge in the domain of explainable machine learning.
Faceted summarization provides briefings of a document from different perspectives. Readers can quickly comprehend the main points of a long document with the help of a structured outline. However, little research has been conducted on this subject, partially due to the lack of large-scale faceted summarization datasets. In this study, we present FacetSum, a faceted summarization benchmark built on Emerald journal articles, covering a diverse range of domains. Different from traditional document-summary pairs, FacetSum provides multiple summaries, each targeted at specific sections of a long document, including the purpose, method, findings, and value. Analyses and empirical results on our dataset reveal the importance of bringing structure into summaries. We believe FacetSum will spur further advances in summarization research and foster the development of NLP systems that can leverage the structured information in both long texts and summaries.
We propose Partially Interpretable Estimators (PIE) which attribute a prediction to individual features via an interpretable model, while a (possibly) small part of the PIE prediction is attributed to the interaction of features via a black-box model, with the goal to boost the predictive performance while maintaining interpretability. As such, the interpretable model captures the main contributions of features, and the black-box model attempts to complement the interpretable piece by capturing the "nuances" of feature interactions as a refinement. We design an iterative training algorithm to jointly train the two types of models. Experimental results show that PIE is highly competitive to black-box models while outperforming interpretable baselines. In addition, the understandability of PIE is comparable to simple linear models as validated via a human evaluation.
The growing popularity of Virtual Assistants poses new challenges for Entity Resolution, the task of linking mentions in text to their referent entities in a knowledge base. Specifically, in the shopping domain, customers tend to use implicit utterances (e.g., "organic milk") rather than explicit names, leading to a large number of candidate products. Meanwhile, for the same query, different customers may expect different results. For example, with "add milk to my cart", a customer may refer to a certain organic product, while some customers may want to re-order products they regularly purchase. To address these issues, we propose a new framework that leverages personalized features to improve the accuracy of product ranking. We first build a cross-source heterogeneous knowledge graph from customer purchase history and product knowledge graph to jointly learn customer and product embeddings. After that, we incorporate product, customer, and history representations into a neural reranking model to predict which candidate is most likely to be purchased for a specific customer. Experiments show that our model substantially improves the accuracy of the top ranked candidates by 24.6% compared to the state-of-the-art product search model.
To address the problem of long-tail distribution for the large vocabulary object detection task, existing methods usually divide the whole categories into several groups and treat each group with different strategies. These methods bring the following two problems. One is the training inconsistency between adjacent categories of similar sizes, and the other is that the learned model is lack of discrimination for tail categories which are semantically similar to some of the head categories. In this paper, we devise a novel Adaptive Class Suppression Loss (ACSL) to effectively tackle the above problems and improve the detection performance of tail categories. Specifically, we introduce a statistic-free perspective to analyze the long-tail distribution, breaking the limitation of manual grouping. According to this perspective, our ACSL adjusts the suppression gradients for each sample of each class adaptively, ensuring the training consistency and boosting the discrimination for rare categories. Extensive experiments on long-tail datasets LVIS and Open Images show that the our ACSL achieves 5.18% and 5.2% improvements with ResNet50-FPN, and sets a new state of the art. Code and models are available at https://github.com/CASIA-IVA-Lab/ACSL.
As gradient descent method in deep learning causes a series of questions, this paper proposes a novel gradient-free deep learning structure. By adding a new module into traditional Self-Organizing Map and introducing residual into the map, a Deep Valued Self-Organizing Map network is constructed. And analysis about the convergence performance of such a deep Valued Self-Organizing Map network is proved in this paper, which gives an inequality about the designed parameters with the dimension of inputs and the loss of prediction.
We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, and in particular, a human decision-maker. Our method detects in the feature space where the black-box decision-maker is biased and replaces it with a few short decision rules, acting as a "fair surrogate". The rule-based surrogate model is trained under two objectives, predictive performance and fairness. Our model focuses on a setting that is common in practice but distinct from other literature on fairness. We only have black-box access to the model, and only a limited set of true labels can be queried under a budget constraint. We formulate a multi-objective optimization for building a surrogate model, where we simultaneously optimize for both predictive performance and bias. To train the model, we propose a novel training algorithm that combines a nondominated sorting genetic algorithm with active learning. We test our model on public datasets where we simulate various biased "black-box" classifiers (decision-makers) and apply our approach for interpretable augmented fairness.
Recent years have seen a flourishing of neural keyphrase generation works, including the release of several large-scale datasets and a host of new models to tackle them. Model performance on keyphrase generation tasks has increased significantly with evolving deep learning research. However, there lacks a comprehensive comparison among models, and an investigation on related factors (e.g., architectural choice, decoding strategy) that may affect a keyphrase generation system's performance. In this empirical study, we aim to fill this gap by providing extensive experimental results and analyzing the most crucial factors impacting the performance of keyphrase generation models. We hope this study can help clarify some of the uncertainties surrounding the keyphrase generation task and facilitate future research on this topic.
The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. Existing literature on the problem has focused on maximizing the utility, represented as the total number of expected requests served. However, a utility-driven solution results in unequal opportunities for customers to receive delivery service, raising questions about fairness. In this paper, we study the problem of achieving fairness in SDD. We construct a regional-level fairness constraint that ensures customers from different regions have an equal chance of being served. We develop a reinforcement learning model to learn policies that focus on both overall utility and fairness. Experimental results demonstrate the ability of our approach to mitigate the unfairness caused by geographic differences and constraints of resources, at both coarser and finer-grained level and with a small cost to utility. In addition, we simulate a real-world situation where the system is suddenly overwhelmed by a surge of requests, mimicking the COVID-19 scenario. Our model is robust to the systematic pressure and is able to maintain fairness with little compromise to the utility.