Recommendation is a prevalent application of machine learning that affects many users; therefore, it is important for recommender models to be accurate and interpretable. In this work, we propose a method to both interpret and augment the predictions of black-box recommender systems. In particular, we propose to interpret feature interactions from a source recommender model and explicitly encode these interactions in a target recommender model, where both the source and target models are black boxes. Because our approach does not assume the structure of the recommender system, it can be used in general settings. In our experiments, we focus on a prominent use of machine learning recommendation: ad-click prediction. We find that our interaction interpretations are both informative and predictive; for example, encoding them in the target model significantly outperforms existing recommender models. Moreover, the same approach to interpreting interactions can provide new insights into domains beyond recommendation, such as text and image classification.
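As an illustration of the interpret-then-encode idea, the sketch below detects a pairwise interaction in a toy black-box model with a simple perturbation test and then encodes the detected pair as an explicit crossed feature for a downstream target model. The perturbation test and the product-feature encoding are simplifying assumptions for illustration, not the exact detector or encoding used in this work.

```python
# Minimal sketch (not the paper's exact algorithm): detect a pairwise feature
# interaction in a black-box model via a simple perturbation test, then encode
# the detected pair as an explicit crossed feature for a target model.
import numpy as np

def interaction_strength(predict, x, i, j, baseline):
    """Deviation from additivity when features i and j are perturbed together."""
    x_i, x_j, x_ij = x.copy(), x.copy(), x.copy()
    x_i[i] = baseline[i]
    x_j[j] = baseline[j]
    x_ij[i], x_ij[j] = baseline[i], baseline[j]
    f = lambda v: predict(v.reshape(1, -1))[0]
    # If f were additive in i and j, the joint effect would equal the sum of
    # the individual effects; the residual signals an interaction.
    return abs((f(x) - f(x_i)) - (f(x_j) - f(x_ij)))

def add_cross_feature(X, i, j):
    """Explicitly encode the (i, j) interaction as a product feature."""
    return np.hstack([X, (X[:, i] * X[:, j]).reshape(-1, 1)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    # Toy black-box source model with a genuine x0*x1 interaction.
    predict = lambda V: V[:, 0] * V[:, 1] + V[:, 2]
    x, baseline = X[0], X.mean(axis=0)
    print("strength(0,1):", interaction_strength(predict, x, 0, 1, baseline))
    print("strength(2,3):", interaction_strength(predict, x, 2, 3, baseline))
    X_aug = add_cross_feature(X, 0, 1)  # feed X_aug to the target recommender
```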
Machine learning transparency calls for interpretable explanations of how inputs relate to predictions. Feature attribution is a way to analyze the impact of features on predictions. Feature interactions are the contextual dependence between features that jointly impact predictions. There are a number of methods that extract feature interactions in prediction models; however, the methods that assign attributions to interactions are either uninterpretable, model-specific, or non-axiomatic. We propose an interaction attribution and detection framework called Archipelago which addresses these problems and is also scalable in real-world settings. Our experiments on standard annotation labels indicate our approach provides significantly more interpretable explanations than comparable methods, which is important for analyzing the impact of interactions on predictions. We also provide accompanying visualizations of our approach that give new insights into deep neural networks.
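For intuition about interaction attribution, the following sketch assigns an attribution to a feature set by contrasting the prediction with that set at its actual values (and everything else at a neutral baseline) against the all-baseline prediction. The choice of baseline and the toy model are assumptions; this is a simplified illustration rather than the full Archipelago detection-plus-attribution pipeline.

```python
# Minimal sketch of an interaction attribution: attribute a feature set by
# comparing the prediction with that set at its actual values (rest at a
# neutral baseline) against the all-baseline prediction.
import numpy as np

def set_attribution(predict, x, baseline, index_set):
    x_masked = baseline.copy()
    x_masked[list(index_set)] = x[list(index_set)]
    return predict(x_masked.reshape(1, -1))[0] - predict(baseline.reshape(1, -1))[0]

if __name__ == "__main__":
    # Toy model with a genuine interaction between features 0 and 1.
    predict = lambda V: np.tanh(V[:, 0] * V[:, 1]) + V[:, 2]
    x = np.array([1.0, 2.0, 0.5])
    baseline = np.zeros(3)
    print("attr({0,1}):", set_attribution(predict, x, baseline, {0, 1}))
    print("attr({0}) + attr({1}):",
          set_attribution(predict, x, baseline, {0}) +
          set_attribution(predict, x, baseline, {1}))
```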
Modeling the dynamics of real-world physical systems is critical for spatiotemporal prediction tasks, but challenging when data are limited. The scarcity of real-world data and the difficulty of reproducing the data distribution hinder the direct application of meta-learning techniques. Although knowledge of the governing partial differential equations (PDEs) of the data can help with fast adaptation to few observations, it is difficult to generalize to different or unknown dynamics. In this paper, we propose a framework, physics-aware modular meta-learning with auxiliary tasks (PiMetaL), in which spatial modules incorporate PDE-independent knowledge and temporal modules adapt rapidly to limited data. The framework does not require the exact form of the governing equations to model the observed spatiotemporal data. Furthermore, it mitigates the need for a large number of real-world tasks for meta-learning by leveraging simulated data. We apply the proposed framework to both synthetic and real-world spatiotemporal prediction tasks and demonstrate its superior performance with limited observations.
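A rough sketch of the modular idea follows: a spatial module pre-trained on simulated data stays frozen as PDE-independent knowledge, while a small temporal module is quickly adapted to a handful of real observations. The module sizes, architectures, and adaptation procedure here are our assumptions, not the paper's exact design.

```python
# Minimal sketch (architecture and names are assumptions): a frozen spatial
# module provides PDE-independent features; a small temporal module adapts
# to a few real observations.
import torch
import torch.nn as nn

class SpatialModule(nn.Module):
    """Pre-trained on simulated PDE data; learns spatial-derivative-like features."""
    def __init__(self, d):
        super().__init__()
        self.net = nn.Linear(d, 16)
    def forward(self, x):
        return torch.relu(self.net(x))

class TemporalModule(nn.Module):
    """Maps spatial features to the next state; adapted per task."""
    def __init__(self, d):
        super().__init__()
        self.net = nn.Linear(16, d)
    def forward(self, h):
        return self.net(h)

def adapt(spatial, temporal, x_few, y_few, steps=20, lr=1e-2):
    """Fast adaptation: only the temporal module is updated on limited data."""
    opt = torch.optim.Adam(temporal.parameters(), lr=lr)
    for _ in range(steps):
        loss = nn.functional.mse_loss(temporal(spatial(x_few)), y_few)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

if __name__ == "__main__":
    d = 8
    spatial, temporal = SpatialModule(d), TemporalModule(d)
    for p in spatial.parameters():
        p.requires_grad_(False)                              # PDE-independent, reused
    x_few, y_few = torch.randn(10, d), torch.randn(10, d)    # few real observations
    print("adapted loss:", adapt(spatial, temporal, x_few, y_few))
```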
Third-party libraries are crucial to the development of software projects. To find suitable libraries, developers need to search through millions of candidates by filtering, evaluating, and comparing. The vast number of libraries is a barrier for programmers trying to locate appropriate ones. To help developers, researchers have proposed automated approaches that recommend libraries based on library usage patterns. However, these prior studies cannot sufficiently match user requirements and suffer from the cold-start problem. In this work, we make recommendations based on requirement descriptions to avoid these problems. To this end, we propose a novel neural approach called Req2Lib, which recommends libraries given a description of project requirements. We use a Sequence-to-Sequence model to learn library linked-usage information and the semantic information of requirement descriptions in natural language. In addition, we apply a domain-specific pre-trained word2vec model for word embedding, trained on a textual corpus of Stack Overflow posts. In our experiments, we train and evaluate the model on data from 5,625 Java projects. Our preliminary evaluation demonstrates that Req2Lib can recommend libraries accurately.
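The sketch below shows a generic GRU encoder-decoder of the kind described, mapping a requirement description to a sequence of library tokens; in the described setting, the input embedding would be initialized from the domain-specific word2vec model trained on Stack Overflow posts. Dimensions and layer choices are illustrative assumptions, not Req2Lib's exact configuration.

```python
# Minimal sketch (a generic GRU encoder-decoder, not Req2Lib's exact network):
# encode a requirement description and decode a sequence of library tokens.
import torch
import torch.nn as nn

class Req2LibSketch(nn.Module):
    def __init__(self, vocab_words, vocab_libs, emb=100, hid=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_words, emb)   # load word2vec weights here
        self.lib_emb = nn.Embedding(vocab_libs, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_libs)

    def forward(self, req_tokens, lib_tokens):
        _, h = self.encoder(self.word_emb(req_tokens))   # summarize the requirement
        dec_out, _ = self.decoder(self.lib_emb(lib_tokens), h)
        return self.out(dec_out)                         # scores over library vocab

if __name__ == "__main__":
    model = Req2LibSketch(vocab_words=5000, vocab_libs=800)
    req = torch.randint(0, 5000, (2, 12))   # toy requirement descriptions
    libs = torch.randint(0, 800, (2, 5))    # toy library sequences (teacher forcing)
    print(model(req, libs).shape)           # (2, 5, 800)
```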
The data-driven approach that learns an optimal representation of vision features such as skeleton frames or RGB videos is currently the dominant paradigm for activity recognition. While single-modal approaches have achieved great improvements with increasingly larger datasets, fusing different data modalities at the feature level has seldom been attempted. In this paper, we propose a multimodal feature fusion model that utilizes both skeleton and RGB modalities to infer human activity. The objective is to improve activity recognition accuracy by effectively exploiting the mutually complementary information among different data modalities. For the skeleton modality, we use a graph convolutional subnetwork to learn the skeleton representation. For the RGB modality, we use the spatio-temporal regions of interest from RGB videos and take attention features from the skeleton modality to guide the learning process. The model can be trained either separately or jointly by back-propagation in an end-to-end manner. Experiments on the NTU RGB+D and Northwestern-UCLA Multiview datasets achieve state-of-the-art performance, indicating that the proposed skeleton-driven attention mechanism for the RGB modality increases the mutual communication between data modalities and yields more discriminative features for inferring human activities.
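To make the skeleton-guided attention concrete, the simplified stand-in below uses skeleton features to produce attention weights over RGB region-of-interest features and fuses the attended RGB features with the skeleton features for classification. The feature dimensions and the single-linear-layer attention are assumptions for illustration; this is not the paper's full architecture.

```python
# Minimal sketch: skeleton features produce attention weights over RGB
# region-of-interest features; attended RGB features are fused with the
# skeleton features for classification.
import torch
import torch.nn as nn

class SkeletonGuidedFusion(nn.Module):
    def __init__(self, skel_dim=64, rgb_dim=128, n_classes=60):
        super().__init__()
        self.attn = nn.Linear(skel_dim, rgb_dim)            # skeleton -> attention query
        self.classifier = nn.Linear(skel_dim + rgb_dim, n_classes)

    def forward(self, skel_feat, rgb_rois):
        # skel_feat: (B, skel_dim), e.g., from a graph-convolutional subnetwork
        # rgb_rois:  (B, R, rgb_dim), features of R spatio-temporal ROIs
        q = self.attn(skel_feat).unsqueeze(1)               # (B, 1, rgb_dim)
        w = torch.softmax((q * rgb_rois).sum(-1), dim=-1)   # (B, R) attention weights
        rgb_feat = (w.unsqueeze(-1) * rgb_rois).sum(1)      # (B, rgb_dim)
        return self.classifier(torch.cat([skel_feat, rgb_feat], dim=-1))

if __name__ == "__main__":
    model = SkeletonGuidedFusion()
    logits = model(torch.randn(4, 64), torch.randn(4, 6, 128))
    print(logits.shape)  # (4, 60)
```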
Most recent work on vision-based human activity recognition (HAR) focuses on designing complex deep learning models for the task, which in turn requires collecting large datasets. Because acquiring and processing large training datasets is usually very expensive, the problem of reducing dataset size without sacrificing recognition accuracy has to be tackled. To do so, we propose a HAR method that consists of three steps: (i) data transformation, in which new features are generated by transforming the raw data; (ii) feature extraction, in which a classifier is learned with the AdaBoost algorithm using training data consisting of the transformed features; and (iii) parameter determination and pattern recognition, in which parameters derived in (ii) are used as training data for deep learning algorithms that recognize human activities. Compared to existing approaches, the proposed approach is simple and robust. It has been tested in a number of experiments on a relatively small real dataset. The experimental results indicate that, using the proposed method, human activities can be recognized more accurately even with a smaller training set.
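A minimal sketch of one reading of the three-step pipeline follows, with assumed details: raw windows are transformed into simple statistical features, AdaBoost is fit on the transformed features, and AdaBoost-derived outputs are then fed to a downstream learner for the final recognition. The specific statistics, the use of the decision function as the derived representation, and the small MLP are illustrative choices, not the exact method.

```python
# Minimal sketch of the three-step pipeline (details are assumptions):
# (i) transform raw signals, (ii) fit AdaBoost on transformed features,
# (iii) use AdaBoost-derived outputs as inputs to a downstream learner.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

def transform(raw):
    """Step (i): per-window statistics as new features (illustrative choice)."""
    return np.column_stack([raw.mean(1), raw.std(1), raw.min(1), raw.max(1)])

rng = np.random.default_rng(0)
raw = rng.normal(size=(300, 50))                     # toy sensor/feature windows
y = rng.integers(0, 3, size=300)                     # toy activity labels

X = transform(raw)                                   # step (i)
ada = AdaBoostClassifier(n_estimators=25).fit(X, y)  # step (ii)
Z = ada.decision_function(X)                         # step (iii): derived representation
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(Z, y)
print("train accuracy:", clf.score(Z, y))
```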
The ongoing Coronavirus Disease (COVID-19) pandemic highlights the interconnectedness of our present-day globalized world. With social distancing policies in place, virtual communication has become an important source of (mis)information. As an increasing number of people rely on social media platforms for news, identifying misinformation has emerged as a critical task in these unprecedented times. Beyond being malicious, the spread of such information poses a serious public health risk. To this end, we design a dashboard to track misinformation on Twitter, a popular social media platform for news sharing. The dashboard provides visibility into the social media discussions around Coronavirus and the quality of information shared on the platform, updated over time. We collect streaming data using the Twitter API from March 1, 2020 to date and identify false, misleading, and clickbait content among the collected Tweets. We provide analysis of user accounts and misinformation spread across countries. In addition, we analyze public sentiment on intervention policies such as "#socialdistancing" and "#workfromhome", and we track topics, emerging hashtags, and sentiments across countries. The dashboard maintains an evolving list of misinformation cascades, sentiments, and emerging trends over time, accessible online at \url{https://usc-melady.github.io/COVID-19-Tweet-Analysis}.
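As a toy illustration of the flagging and tracking steps, the sketch below matches collected tweets against a small curated list of misleading-claim patterns and counts emerging hashtags per country. The field names, patterns, and rule-based flagging are assumptions for illustration; the deployed pipeline consumes the Twitter streaming API and uses richer misinformation identification.

```python
# Toy illustration (field names and flagging rule are assumptions): flag tweets
# against curated misleading-claim patterns and count hashtags per country.
import re
from collections import Counter

MISLEADING_PATTERNS = [r"5g.*(cause|spread)", r"drink.*bleach"]  # curated, evolving list

def flag_misinformation(text):
    t = text.lower()
    return any(re.search(p, t) for p in MISLEADING_PATTERNS)

def hashtag_counts(tweets, country):
    tags = Counter()
    for tw in tweets:
        if tw["country"] == country:
            tags.update(re.findall(r"#\w+", tw["text"].lower()))
    return tags

tweets = [  # toy stand-ins for streamed tweets
    {"text": "Claim: 5G causes the virus #conspiracy", "country": "US"},
    {"text": "Stay home! #socialdistancing #workfromhome", "country": "US"},
]
print([flag_misinformation(t["text"]) for t in tweets])
print(hashtag_counts(tweets, "US").most_common(3))
```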
Multi-sense word embeddings are promising solutions for word sense learning. Nevertheless, building a large-scale training corpus and learning the appropriate word sense are still open issues. In this paper, we propose a method for Decomposing the word Embedding into context-specific Sense representations, called DecE2S. First, an unsupervised polysemy embedding is fed into a capsule network to produce multiple sememe-like vectors. Second, using attention operations, DecE2S integrates the word's context to represent the context-specific sense vector. To train DecE2S, we design a word-matching training method for learning the context-specific sense representation. DecE2S was experimentally evaluated on two sense learning tasks, i.e., word in context and word sense disambiguation. Results on two public corpora, Word-in-Context and English all-words Word Sense Disambiguation, show that the DecE2S model achieves a new state of the art for both tasks.
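The sketch below illustrates only the attention step in simplified form: given a word's candidate sense vectors (standing in for the capsule network's sememe-like vectors) and a context vector, it attends over the senses to produce a context-specific sense representation. The capsule decomposition and training procedure are omitted, and the dimensions are assumptions.

```python
# Minimal sketch of the attention step: attend over a word's candidate sense
# vectors with a context vector to get a context-specific sense representation.
import numpy as np

def context_specific_sense(sense_vecs, context_vec):
    # sense_vecs: (K, d) sememe-like vectors for one word; context_vec: (d,)
    scores = sense_vecs @ context_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax attention over senses
    return weights @ sense_vecs, weights

rng = np.random.default_rng(0)
senses = rng.normal(size=(3, 50))                # e.g., "bank": riverbank vs. finance
context = senses[1] + 0.1 * rng.normal(size=50)  # context close to sense 1
vec, w = context_specific_sense(senses, context)
print("attention weights:", w.round(3))
```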
Label noise is increasingly prevalent in datasets acquired from noisy channels. Existing approaches that detect and remove label noise generally rely on some form of supervision, which is neither scalable nor free of error. In this paper, we propose NoiseRank, an approach to unsupervised label noise reduction using Markov Random Fields (MRFs). We construct a dependence model to estimate the posterior probability of an instance being incorrectly labeled given the dataset, and rank instances by their estimated probabilities. Our method (1) does not require supervision from ground-truth labels or priors on the label or noise distribution, (2) is interpretable by design, enabling transparency in label noise removal, and (3) is agnostic to the classifier architecture/optimization framework and content modality. These advantages enable wide applicability in real noise settings, unlike prior works constrained by one or more of these conditions. NoiseRank improves state-of-the-art classification on Food101-N (~20% noise) and is effective on the high-noise Clothing-1M dataset (~40% noise).
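To convey the ranking idea in simplified form, the sketch below scores each instance by how strongly its nearest neighbors in feature space disagree with its given label and ranks instances by that score. This neighbor-disagreement score is a stand-in for illustration only, not the paper's MRF-based posterior estimate.

```python
# Simplified stand-in (not the MRF formulation): rank instances by how strongly
# their nearest neighbors in feature space disagree with the given label.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def noise_scores(features, labels, k=10):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)
    neighbor_labels = labels[idx[:, 1:]]               # drop the self-match
    disagreement = (neighbor_labels != labels[:, None]).mean(axis=1)
    return disagreement                                # higher = more likely mislabeled

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)
y[:5] = 1                                              # inject label noise
scores = noise_scores(X, y)
print("top suspected noisy indices:", np.argsort(-scores)[:5])
```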