"Recommendation": models, code, and papers

GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering

Apr 08, 2022
Yoonseok Yang, Kyu Seok Kim, Minsam Kim, Juneyoung Park

Content-based collaborative filtering (CCF) provides personalized item recommendations based on both users' interaction history and items' content information. Recently, pre-trained language models (PLMs) have been used to extract high-quality item encodings for CCF. However, it is resource-intensive to fine-tune a PLM in an end-to-end (E2E) manner in CCF due to its multi-modal nature: optimization involves redundant content encoding for interactions from users. To address this, we propose GRAM (GRadient Accumulation for Multi-modality): (1) Single-step GRAM, which aggregates gradients for each item while maintaining theoretical equivalence with E2E, and (2) Multi-step GRAM, which further accumulates gradients across multiple training steps, with less than 40% of the GPU memory footprint of E2E. We empirically confirm that GRAM achieves a remarkable boost in training efficiency based on five datasets from two task domains, Knowledge Tracing and News Recommendation, where single-step and multi-step GRAM achieve 4x and 45x training speedup on average, respectively.

* NAACL 2022 Main Conference 
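
The equivalence claimed for single-step GRAM can be pictured through the chain rule: when several interactions share an item, the gradients arriving at that item's encoding can be summed first, and the encoder back-propagated once per item. The following is a minimal toy sketch of that identity only, not the authors' implementation. The scalar `encode` function, the squared-error loss, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

def encode(w, x):
    """Toy stand-in for a content encoder: one weight, one scalar feature (assumption)."""
    return w * x

def grad_per_interaction(w, item_feats, interactions):
    """Back-propagate through the encoder once per interaction."""
    g = 0.0
    for item, target in interactions:
        e = encode(w, item_feats[item])
        dloss_de = 2.0 * (e - target)        # derivative of (e - target)^2 w.r.t. e
        g += dloss_de * item_feats[item]     # de/dw = x for this toy encoder
    return g

def grad_per_item(w, item_feats, interactions):
    """Sum upstream gradients per item, then back-propagate once per item."""
    encodings = {i: encode(w, x) for i, x in item_feats.items()}  # one encoding per item
    upstream = defaultdict(float)
    for item, target in interactions:
        upstream[item] += 2.0 * (encodings[item] - target)
    return sum(g * item_feats[item] for item, g in upstream.items())

# Hypothetical data: item "a" appears in three interactions, "b" in one.
item_feats = {"a": 1.5, "b": -0.5}
interactions = [("a", 1.0), ("a", 0.0), ("b", 1.0), ("a", 1.0)]
```

Both functions return the same gradient up to floating-point rounding, while the second encodes each item once regardless of how many interactions refer to it.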

Hierarchical Latent Relation Modeling for Collaborative Metric Learning

Jul 26, 2021
Viet-Anh Tran, Guillaume Salha-Galvan, Romain Hennequin, Manuel Moussallam

Collaborative Metric Learning (CML) recently emerged as a powerful paradigm for recommendation based on implicit feedback collaborative filtering. However, standard CML methods learn fixed user and item representations, which fails to capture the complex interests of users. Existing extensions of CML also either ignore the heterogeneity of user-item relations, i.e. that a user can simultaneously like very different items, or the latent item-item relations, i.e. that a user's preference for an item depends, not only on its intrinsic characteristics, but also on items they previously interacted with. In this paper, we present a hierarchical CML model that jointly captures latent user-item and item-item relations from implicit data. Our approach is inspired by translation mechanisms from knowledge graph embedding and leverages memory-based attention networks. We empirically show the relevance of this joint relational modeling, by outperforming existing CML models on recommendation tasks on several real-world datasets. Our experiments also emphasize the limits of current CML relational models on very sparse datasets.

* 15th ACM Conference on Recommender Systems (RecSys 2021) 
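
The translation idea borrowed from knowledge graph embedding can be sketched as scoring a user-item pair by the distance between a translated user `user + relation` and the item, trained with a hinge loss over (positive, negative) item pairs. This is a generic, simplified sketch and not the paper's hierarchical model: the relation vector is passed in directly (the abstract says it comes from memory-based attention networks, which are not shown), one relation is shared by both items, and all function names are hypothetical.

```python
import math

def translated_distance(user, relation, item):
    """Euclidean distance between the translated user (user + relation) and the item."""
    return math.sqrt(sum((u + r - v) ** 2 for u, r, v in zip(user, relation, item)))

def hinge_triplet_loss(user, relation, pos_item, neg_item, margin=1.0):
    """Zero when the positive item is closer than the negative one by at least `margin`."""
    d_pos = translated_distance(user, relation, pos_item)
    d_neg = translated_distance(user, relation, neg_item)
    return max(0.0, margin + d_pos - d_neg)
```

For example, with `user=[0, 0]`, `relation=[1, 0]`, and `pos_item=[1, 0]`, a distant negative such as `[4, 0]` gives zero loss, while a nearby negative such as `[1.5, 0]` is penalised.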

Identifying Causal Effect Inference Failure with Uncertainty-Aware Models

Jul 01, 2020
Andrew Jesson, Sören Mindermann, Uri Shalit, Yarin Gal

Recommending the best course of action for an individual is a major application of individual-level causal effect estimation. This application is often needed in safety-critical domains such as healthcare, where estimating and communicating uncertainty to decision-makers is crucial. We introduce a practical approach for integrating uncertainty estimation into a class of state-of-the-art neural network methods used for individual-level causal estimates. We show that our methods enable us to deal gracefully with situations of "no-overlap", common in high-dimensional data, where standard applications of causal effect approaches fail. Further, our methods allow us to handle covariate shift, where the test distribution differs from the training distribution, which is common when systems are deployed in practice. We show that when such a covariate shift occurs, correctly modeling uncertainty can keep us from giving overconfident and potentially harmful recommendations. We demonstrate our methodology with a range of state-of-the-art models. Under both covariate shift and lack of overlap, our uncertainty-equipped methods can alert decision-makers when predictions are not to be trusted while outperforming their uncertainty-oblivious counterparts.
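
The idea of alerting decision-makers when a prediction should not be trusted can be sketched as a simple deferral rule: given several sampled estimates of one individual's treatment effect, recommend only when their spread is small. This is a rough sketch under stated assumptions, not the paper's method. How the samples are produced (for instance an ensemble or stochastic forward passes), the `max_std` threshold, and the function name are all hypothetical.

```python
from statistics import mean, pstdev

def recommend_or_defer(effect_samples, max_std=0.5):
    """Return (decision, mean_effect, std) for one individual.

    effect_samples: sampled estimates of the individual treatment effect
    max_std: hypothetical uncertainty threshold above which we defer to a human
    """
    mu = mean(effect_samples)
    sigma = pstdev(effect_samples)
    if sigma > max_std:
        return "defer", mu, sigma
    return ("treat" if mu > 0 else "do_not_treat"), mu, sigma
```

Samples that agree, such as `[1.0, 1.1, 0.9]`, lead to a recommendation, whereas widely scattered samples such as `[-2.0, 2.5, 0.1]` lead to deferral.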

SupRB: A Supervised Rule-based Learning System for Continuous Problems

Feb 24, 2020
Michael Heider, David Pätzel, Jörg Hähner

We propose the SupRB learning system, a new Pittsburgh-style learning classifier system (LCS) for supervised learning on multi-dimensional continuous decision problems. SupRB learns an approximation of a quality function from examples (consisting of situations, choices and associated qualities) and is then able to make an optimal choice as well as predict the quality of a choice in a given situation. One area of application for SupRB is parametrization of industrial machinery. In this field, acceptance of the recommendations of machine learning systems is highly reliant on operators' trust. While an essential and much-researched ingredient for that trust is prediction quality, it seems that this alone is not enough. At least as important is a human-understandable explanation of the reasoning behind a recommendation. While many state-of-the-art methods such as artificial neural networks fall short of this, LCSs such as SupRB provide human-readable rules that can be understood very easily. The prevalent LCSs are not directly applicable to this problem as they lack support for continuous choices. This paper lays the foundations for SupRB and shows its general applicability on a simplified model of an additive manufacturing problem.

* Submitted to the Genetic and Evolutionary Computation Conference 2020 (GECCO 2020) 

Product Knowledge Graph Embedding for E-commerce

Nov 28, 2019
Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan

In this paper, we propose a new product knowledge graph (PKG) embedding approach for learning the intrinsic product relations as product knowledge for e-commerce. We define the key entities and summarize the pivotal product relations that are critical for general e-commerce applications including marketing, advertisement, search ranking and recommendation. We first provide a comprehensive comparison between PKG and ordinary knowledge graph (KG) and then illustrate why KG embedding methods are not suitable for PKG learning. We construct a self-attention-enhanced distributed representation learning model for learning PKG embeddings from raw customer activity data in an end-to-end fashion. We design an effective multi-task learning schema to fully leverage the multi-modal e-commerce data. The Poincare embedding is also employed to handle complex entity structures. We use a real-world dataset from to evaluate the performances on knowledge completion, search ranking and recommendation. The proposed approach compares favourably to baselines in knowledge completion and downstream tasks.
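
For readers unfamiliar with the Poincaré embedding mentioned above, the standard distance between two points `u` and `v` inside the unit ball is `arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))`. The sketch below computes only that distance; how the paper plugs it into its model is not described in the abstract, and the function name is an assumption.

```python
import math

def poincare_distance(u, v):
    """Distance between two points of the open unit ball (Poincaré ball model)."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))
```

Distances grow rapidly as points approach the boundary of the ball, which is what makes this geometry suited to tree-like or hierarchical structures.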

Improving Contrastive Learning with Model Augmentation

Mar 25, 2022
Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong

Sequential recommendation aims at predicting the next items in user behaviors, which can be solved by characterizing item relationships in sequences. Due to the data sparsity and noise issues in sequences, a new self-supervised learning (SSL) paradigm is proposed to improve the performance, which employs contrastive learning between positive and negative views of sequences. However, existing methods all construct views by adopting augmentation from data perspectives, while we argue that 1) optimal data augmentation methods are hard to devise, 2) data augmentation methods destroy sequential correlations, and 3) data augmentation fails to incorporate comprehensive self-supervised signals. Therefore, we investigate the possibility of model augmentation to construct view pairs. We propose three levels of model augmentation methods: neuron masking, layer dropping, and encoder complementing. This work opens up a novel direction in constructing views for contrastive SSL. Experiments verify the efficacy of model augmentation for SSL in sequential recommendation. Code is available.

* Preprint. Still under review 
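
Of the three augmentation levels listed above, neuron masking is the simplest to picture: the same input is passed through the same encoder twice, each time with a different random subset of hidden neurons zeroed, and the two outputs form a view pair. The sketch below is a toy illustration under stated assumptions, not the paper's code. It uses a single linear layer with ReLU in place of a sequential encoder, omits the contrastive loss, and all names and values are hypothetical.

```python
import math
import random

def encode_with_neuron_mask(x, weights, mask_prob, rng):
    """One linear layer + ReLU; each hidden neuron is zeroed with probability mask_prob."""
    hidden = []
    for row in weights:
        act = max(0.0, sum(w * xi for w, xi in zip(row, x)))
        hidden.append(0.0 if rng.random() < mask_prob else act)
    return hidden

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(p * q for p, q in zip(a, b)) / (na * nb)

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]  # toy 4 -> 8 layer
seq_repr = [0.2, -0.1, 0.4, 0.3]  # stand-in for an encoded user sequence

# Two views of the same sequence, differing only in which neurons were masked.
view_a = encode_with_neuron_mask(seq_repr, weights, 0.2, rng)
view_b = encode_with_neuron_mask(seq_repr, weights, 0.2, rng)
similarity = cosine(view_a, view_b)
```

Because the input sequence is left untouched, its item order is preserved in both views; only the model differs between them.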

A Survey of Generalisation in Deep Reinforcement Learning

Nov 18, 2021
Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel

The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works. We go on to categorise existing benchmarks for generalisation, as well as current methods for tackling the generalisation problem. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in generalisation, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for generalisation, and we recommend building benchmarks in underexplored problem settings such as offline RL generalisation and reward-function variation.

A Mining Software Repository Extended Cookbook: Lessons learned from a literature review

Oct 08, 2021
Daniel Barros, Flavio Horita, Igor Wiese, Kanan Silva

The main purpose of Mining Software Repositories (MSR) is to discover the latest enhancements and provide an insight into how to make improvements in a software project. In light of it, this paper updates the MSR findings of the original MSR Cookbook, by first conducting a systematic mapping study to elicit and analyze the state-of-the-art, and then proposing an extended version of the Cookbook. This extended Cookbook was built on four high-level themes, which were derived from the analysis of a list of 112 selected studies. Hence, it was used to consolidate the extended Cookbook as a contribution to practice and research in the following areas by: 1) including studies published in all available and relevant publication venues; 2) including and updating recommendations in all four high-level themes, with an increase of 84% in comments in this study when compared with the original MSR Cookbook; 3) summarizing the tools employed for each high-level theme; and 4) providing lessons learned for future studies. Thus, the extended Cookbook examined in this work can support new research projects, as upgraded recommendations and the lessons learned are available with the aid of samples and tools.

SPECTER: Document-level Representation Learning using Citation-informed Transformers

May 20, 2020
Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld

Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter-document relatedness, which limits their document-level representation power. For applications on scientific documents, such as classification and recommendation, the embeddings power strong performance on end tasks. We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning. Additionally, to encourage further research on document-level models, we introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction, to document classification and recommendation. We show that SPECTER outperforms a variety of competitive baselines on the benchmark.

* ACL 2020 
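
One common way to turn a citation graph into a relatedness signal is to sample triplets of a query paper, a paper it cites (positive), and a paper it does not cite (negative). The abstract does not spell out the training objective, so the triplet formulation here is an assumption for illustration, as are the function name and the toy graph.

```python
import random

def sample_triplets(citations, all_papers, rng, per_query=1):
    """Sample (query, positive, negative) triplets from a citation graph.

    citations: dict mapping a paper id to the set of paper ids it cites
    all_papers: list of every paper id available as a negative candidate
    """
    triplets = []
    for query, cited in citations.items():
        negatives = [p for p in all_papers if p != query and p not in cited]
        if not cited or not negatives:
            continue  # nothing to contrast for this query
        positives = sorted(cited)  # sorted for reproducible sampling
        for _ in range(per_query):
            triplets.append((query, rng.choice(positives), rng.choice(negatives)))
    return triplets

# Hypothetical toy graph: A cites B, C cites A and B, B cites nothing.
toy_citations = {"A": {"B"}, "B": set(), "C": {"A", "B"}}
toy_triplets = sample_triplets(toy_citations, ["A", "B", "C", "D"], random.Random(0))
```

Each triplet could then feed a ranking loss that pulls the query embedding toward the cited paper and away from the uncited one.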