"Recommendation": models, code, and papers

Saec: Similarity-Aware Embedding Compression in Recommendation Systems

Feb 26, 2019
Xiaorui Wu, Hong Xu, Honglin Zhang, Huaming Chen, Jian Wang

Production recommendation systems rely on embedding methods to represent various features. An impeding challenge in practice is that the large embedding matrix incurs a substantial memory footprint in serving as the number of features grows over time. We propose a similarity-aware embedding matrix compression method called Saec to address this challenge. Saec clusters similar features within a field to reduce the embedding matrix size. Saec also adopts a fast clustering optimization based on feature frequency to drastically reduce clustering time. We implement and evaluate Saec on Numerous, the production distributed machine learning system in Tencent, with 10 days' worth of feature data from QQ mobile browser. Testbed experiments show that Saec reduces the number of embedding vectors by two orders of magnitude, compresses the embedding size by ~27x, and delivers the same AUC and log loss performance.
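The idea of clustering similar features within a field can be illustrated with a minimal sketch. This is not the Saec implementation: it assumes plain k-means over the rows of a single field's embedding matrix, and the names `compress_field`, `_nearest`, and `field_emb` are hypothetical. The frequency-based speed-up mentioned in the abstract is not modelled here.

```python
import numpy as np

def _nearest(emb, centroids):
    """Index of the closest centroid (squared Euclidean) for every row."""
    dists = ((emb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def compress_field(emb, k, iters=20, seed=0):
    """Replace one field's embedding rows with k shared centroids.

    Returns (centroids, assignment); assignment[i] is the centroid index
    that stands in for feature i's original embedding vector.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows of the matrix.
    centroids = emb[rng.choice(len(emb), size=k, replace=False)].copy()
    for _ in range(iters):
        assignment = _nearest(emb, centroids)
        for c in range(k):
            members = emb[assignment == c]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return centroids, _nearest(emb, centroids)

# Toy usage: 1000 feature embeddings of width 8 share 10 centroid vectors.
rng = np.random.default_rng(1)
field_emb = rng.normal(size=(1000, 8))
centroids, assignment = compress_field(field_emb, k=10)
approx = centroids[assignment]  # lookup used at serving time
print(field_emb.shape[0] / centroids.shape[0])  # ratio of stored vectors
```

At serving time only `centroids` and the integer `assignment` map need to be kept, which is where the memory saving in such a scheme would come from.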

On the Fairness of Randomized Trials for Recommendation With Heterogeneous Demographics and Beyond

Jan 25, 2020
Zifeng Wang, Xi Chen, Rui Wen, Shao-Lun Huang

Observed events in recommendation are a consequence of the decisions made by a policy; thus they are usually selectively labeled, namely the data are Missing Not At Random (MNAR), which often causes large bias in the estimate of true outcomes risk. A general approach to correcting MNAR bias is to perform small Randomized Controlled Trials (RCTs), where an additional uniform policy is employed to randomly assign items to each user. In this work, we concentrate on the fairness of RCTs under both homogeneous and heterogeneous demographics, especially analyzing the bias for the least favorable group in the latter setting. Considering RCTs' limitations, we propose a novel Counterfactual Robust Risk Minimization (CRRM) framework, which is totally free of expensive RCTs, and derive its theoretical generalization error bound. Finally, empirical experiments are performed on synthetic tasks and real-world data sets, substantiating our method's superiority in both fairness and generalization.

A Deep, Forgetful Novelty-Seeking Movie Recommender Model

Sep 02, 2019
Ruomu Zou

As more and more people shift their movie watching online, competition between movie viewing websites is getting more and more intense. Therefore, it has become incredibly important to accurately predict a given user's watching list to maximize the chances of keeping the user on the platform. Recent studies have suggested that the novelty-seeking propensity of users can impact their viewing behavior. In this paper, we aim to accurately model and describe this novelty-seeking trait across many users and timestamps driven by data, taking into consideration user forgetfulness. Compared to previous studies, we propose a more robust measure for novelty. Our model, termed Deep Forgetful Novelty-Seeking Model (DFNSM), leverages demographic information about users, genre information about movies, and novelty-seeking traits to predict the most likely next actions of a user. To evaluate the performance of our model, we conducted extensive experiments on a large movie rating dataset. The results reveal that DFNSM is very effective for movie recommendation.

* 19 pages, 14 figures, submitted as a contest entry to the S.-T. Yau High School Science Award (Computer Award) 

Learning over no-Preferred and Preferred Sequence of items for Robust Recommendation

Dec 12, 2020
Aleksandra Burashnikova, Marianne Clausel, Charlotte Laclau, Frack Iutzeller, Yury Maximov, Massih-Reza Amini

In this paper, we propose a theoretically founded sequential strategy for training large-scale Recommender Systems (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists of minimizing a pairwise ranking loss over blocks of consecutive items, each constituted by a sequence of non-clicked items followed by a clicked one, for each user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To avoid updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. The thresholds affect the decision of RS and imply a shift over the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections, both regarding different ranking measures and computational time.

* 21 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:1902.08495 
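The block construction described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the helper names (`split_into_blocks`, `block_ranking_loss`, `keep_user`) are hypothetical, the pairwise loss is assumed to be the logistic surrogate, a click with no preceding non-clicked items is assumed to form no block, and the thresholds are taken as given rather than estimated from the training set.

```python
import math

def split_into_blocks(interactions):
    """Split one user's ordered (item, clicked) pairs into blocks.

    Each block is (negatives, positive): a run of non-clicked items
    followed by the clicked item that closes the run.
    """
    blocks, negatives = [], []
    for item, clicked in interactions:
        if not clicked:
            negatives.append(item)
        elif negatives:
            blocks.append((negatives, item))
            negatives = []
        # Assumption: a click with no pending negatives yields no pair.
    # Assumption: trailing non-clicked items without a click are dropped.
    return blocks

def block_ranking_loss(score, negatives, positive):
    """Mean logistic pairwise loss of one block.

    Each term grows when a non-clicked item is scored above the clicked one.
    """
    terms = (math.log1p(math.exp(score[n] - score[positive])) for n in negatives)
    return sum(terms) / len(negatives)

def keep_user(blocks, lower, upper):
    """Filter users whose number of blocks (updates) is outside the thresholds."""
    return lower <= len(blocks) <= upper

# Toy usage with hypothetical item ids and model scores.
history = [("a", False), ("b", False), ("c", True), ("d", False), ("e", True)]
scores = {"a": 0.1, "b": 0.4, "c": 0.9, "d": 0.2, "e": 0.3}
user_blocks = split_into_blocks(history)
if keep_user(user_blocks, lower=1, upper=100):
    for negs, pos in user_blocks:
        print(negs, pos, block_ranking_loss(scores, negs, pos))
```

In a training loop, one parameter update would be made per block, so the number of blocks per user is the quantity the thresholds act on.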

One4all User Representation for Recommender Systems in E-commerce

May 24, 2021
Kyuyong Shin, Hanock Kwak, Kyung-Min Kim, Minkyu Kim, Young-Jin Park, Jisu Jeong, Seungjae Jung

General-purpose representation learning through large-scale pre-training has shown promising results in various machine learning fields. For an e-commerce domain, the objective of general-purpose, i.e., one-for-all, representations is efficient application to a wide range of downstream tasks such as user profiling, targeting, and recommendation. In this paper, we systematically compare the generalizability of two learning strategies, i.e., transfer learning through the proposed model, ShopperBERT, vs. learning from scratch. ShopperBERT learns nine pretext tasks with 79.2M parameters from 0.8B user behaviors collected over two years to produce user embeddings. As a result, the MLPs that employ our embedding method outperform more complex models trained from scratch on five out of six tasks. Specifically, the pre-trained embeddings outperform the task-specific supervised features and the strong baselines, which learn the auxiliary dataset for the cold-start problem. We also show the computational efficiency and embedding visualization of the pre-trained features.

A Friend Recommendation System using Semantic Based KNN Algorithm

Sep 30, 2021
Srikantaiah K C, Salony Mewara, Sneha Goyal, Subhiksha S

Social networking has become a major part of all our lives, and we depend on it for day-to-day purposes. It is a medium used by people all around the world, even in the smallest of towns. Its main purpose is to promote and aid communication between people. Social networks such as Facebook and Twitter were created for the sole purpose of helping individuals communicate about anything with each other. These networks are becoming an important and contemporary way to make friends from any part of the world. These new friends can communicate through any form of social media. Recommendation systems exist in all social networks; they help users find new friends, connect with more people, and form associations and alliances.

* Journal of Seybold Report, VOLUME 15 ISSUE 9 2020 , page 1201-1209 

You Do Not Need a Bigger Boat: Recommendations at Reasonable Scale in a (Mostly) Serverless and Open Stack

Jul 15, 2021
Jacopo Tagliabue

We argue that immature data pipelines are preventing a large portion of industry practitioners from leveraging the latest research on recommender systems. We propose our template data stack for machine learning at "reasonable scale", and show how many challenges are solved by embracing a serverless paradigm. Leveraging our experience, we detail how modern open source can provide a pipeline processing terabytes of data with limited infrastructure work.

* Manuscript version of a work accepted at RecSys 2021 (camera-ready forthcoming) 

RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

Dec 03, 2016
Andrzej Pacuk, Piotr Sankowski, Karol Węgrzycki, Adam Witkowski, Piotr Wygocki

We present the Mim-Solution's approach to the RecSys Challenge 2016, which ranked 2nd. The goal of the competition was to prepare job recommendations for the users of the website. Our two-phase algorithm consists of candidate selection followed by candidate ranking. We ranked the candidates by the predicted probability that the user will positively interact with the job offer. We have used Gradient Boosting Decision Trees as the regression tool.

* Proceedings of the Recommender Systems Challenge, RecSys Challenge '16, Boston, Massachusetts - September 15 - 15, 2016, pages 10:1--10:4 
* 6 pages, 1 figure, 2 tables, Description of 2nd place winning solution of RecSys 2016 Challenge. To be published in RecSys'16 Challenge Proceedings 
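The two-phase structure in this abstract (candidate selection, then ranking by predicted interaction probability) can be sketched in a few lines. Everything named here is hypothetical: `recommend`, `same_region`, and `skill_overlap` are illustrative, and the skill-overlap score is only a stand-in for a trained Gradient Boosting Decision Trees model, which this sketch does not include.

```python
def recommend(user, offers, is_candidate, predict_interaction, top_n=30):
    """Two-phase recommendation: preselect offers, then rank the survivors."""
    # Phase 1: cheap preselection keeps the candidate set small.
    candidates = [o for o in offers if is_candidate(user, o)]
    # Phase 2: order candidates by predicted probability of a positive interaction.
    ranked = sorted(candidates, key=lambda o: predict_interaction(user, o), reverse=True)
    return ranked[:top_n]

def same_region(user, offer):
    """Hypothetical preselection rule: only offers in the user's region."""
    return user["region"] == offer["region"]

def skill_overlap(user, offer):
    """Stand-in scorer (Jaccard overlap); a trained model would go here."""
    union = user["skills"] | offer["skills"]
    return len(user["skills"] & offer["skills"]) / len(union) if union else 0.0

# Toy usage with made-up users and offers.
user = {"region": "PL", "skills": {"python", "sql"}}
offers = [
    {"id": 1, "region": "PL", "skills": {"python", "sql"}},
    {"id": 2, "region": "PL", "skills": {"java"}},
    {"id": 3, "region": "DE", "skills": {"python", "sql"}},
    {"id": 4, "region": "PL", "skills": {"python", "excel"}},
]
top = recommend(user, offers, same_region, skill_overlap, top_n=2)
print([o["id"] for o in top])
```

Separating the two phases means the expensive scorer only runs on offers that pass the preselection, which is the usual motivation for this design.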

Hierarchical User Intent Graph Network for Multimedia Recommendation

Oct 28, 2021
Wei Yinwei, Wang Xiang, He Xiangnan, Nie Liqiang, Rui Yong, Chua Tat-Seng

In this work, we aim to learn multi-level user intents from the co-interacted patterns of items, so as to obtain high-quality representations of users and items and further enhance the recommendation performance. Towards this end, we develop a novel framework, Hierarchical User Intent Graph Network, which exhibits user intents in a hierarchical graph structure, from the fine-grained to coarse-grained intents. In particular, we get the multi-level user intents by recursively performing two operations: 1) intra-level aggregation, which distills the signal pertinent to user intents from co-interacted item graphs; and 2) inter-level aggregation, which constitutes the supernode in higher levels to model coarser-grained user intents via gathering the nodes' representations in the lower ones. Then, we refine the user and item representations as a distribution over the discovered intents, instead of simple pre-existing features. To demonstrate the effectiveness of our model, we conducted extensive experiments on three public datasets. Our model achieves significant improvements over the state-of-the-art methods, including MMGCN and DisenGCN. Furthermore, by visualizing the item representations, we provide the semantics of user intents.

FPSRS: A Fusion Approach for Paper Submission Recommendation System

May 12, 2022
Son T. Huynh, Nhi Dang, Dac H. Nguyen, Phong T. Huynh, Binh T. Nguyen

Recommender systems have become increasingly popular in entertainment and consumption and are also used in academia, especially in applications that suggest venues where scientists can submit their articles. However, because of the various acceptance rates, impact factors, and rankings of different publishers, searching for a proper venue or journal to submit a scientific work usually takes a lot of time and effort. In this paper, we present two new approaches extended from our paper [13], presented at the conference IEA/AIE 2021, by employing RNN structures in addition to Conv1D. We also introduce a new method, namely DistilBertAims, which uses DistilBERT for the two cases of uppercase and lowercase words to vectorize features such as Title, Abstract, and Keywords, and then uses Conv1D to perform feature extraction. Furthermore, we propose a new calculation method for the similarity score between Aim & Scope and the other features; this keeps the weights of the similarity score calculation continuously updated so that they can continue to fit more data. The experimental results show that the second approach could obtain a better performance, which is 62.46% and 12.44% higher than the best of the previous study [13] in terms of the Top 1 accuracy.

* 24 pages, 10 figures, 8 tables 
