Alert button
Picture for Xinyi Dai

Xinyi Dai

Alert button

MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction

Aug 03, 2023
Jianghao Lin, Yanru Qu, Wei Guo, Xinyi Dai, Ruiming Tang, Yong Yu, Weinan Zhang

Figure 1 for MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction
Figure 2 for MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction
Figure 3 for MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction
Figure 4 for MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction

With the widespread application of personalized online services, click-through rate (CTR) prediction has received more and more attention and research. The most prominent features of CTR prediction are its multi-field categorical data format, and vast and daily-growing data volume. The large capacity of neural models helps digest such massive amounts of data under the supervised learning paradigm, yet they fail to utilize the substantial data to its full potential, since the 1-bit click signal is not sufficient to guide the model to learn capable representations of features and instances. The self-supervised learning paradigm provides a more promising pretrain-finetune solution to better exploit the large amount of user click logs, and learn more generalized and effective representations. However, self-supervised learning for CTR prediction is still an open question, since current works on this line are only preliminary and rudimentary. To this end, we propose a Model-agnostic pretraining (MAP) framework that applies feature corruption and recovery on multi-field categorical data, and more specifically, we derive two practical algorithms: masked feature prediction (MFP) and replaced feature detection (RFD). MFP digs into feature interactions within each instance through masking and predicting a small portion of input features, and introduces noise contrastive estimation (NCE) to handle large feature spaces. RFD further turns MFP into a binary classification mode through replacing and detecting changes in input features, making it even simpler and more effective for CTR pretraining. Our extensive experiments on two real-world large-scale datasets (i.e., Avazu, Criteo) demonstrate the advantages of these two methods on several strong backbones (e.g., DCNv2, DeepFM), and achieve new state-of-the-art performance in terms of both effectiveness and efficiency for CTR prediction.

* KDD 2023 
Viaarxiv icon

How Can Recommender Systems Benefit from Large Language Models: A Survey

Jun 28, 2023
Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, Weinan Zhang

Figure 1 for How Can Recommender Systems Benefit from Large Language Models: A Survey
Figure 2 for How Can Recommender Systems Benefit from Large Language Models: A Survey
Figure 3 for How Can Recommender Systems Benefit from Large Language Models: A Survey
Figure 4 for How Can Recommender Systems Benefit from Large Language Models: A Survey

Recommender systems (RS) play important roles to match users' information needs for Internet applications. In natural language processing (NLP) domains, large language model (LLM) has shown astonishing emergent abilities (e.g., instruction following, reasoning), thus giving rise to the promising research direction of adapting LLM to RS for performance enhancements and user experience improvements. In this paper, we conduct a comprehensive survey on this research direction from an application-oriented view. We first summarize existing research works from two orthogonal perspectives: where and how to adapt LLM to RS. For the "WHERE" question, we discuss the roles that LLM could play in different stages of the recommendation pipeline, i.e., feature engineering, feature encoder, scoring/ranking function, and pipeline controller. For the "HOW" question, we investigate the training and inference strategies, resulting in two fine-grained taxonomy criteria, i.e., whether to tune LLMs or not, and whether to involve conventional recommendation model (CRM) for inference. Detailed analysis and general development trajectories are provided for both questions, respectively. Then, we highlight key challenges in adapting LLM to RS from three aspects, i.e., efficiency, effectiveness, and ethics. Finally, we summarize the survey and discuss the future prospects. We also actively maintain a GitHub repository for papers and other related resources in this rising direction: https://github.com/CHIANGEL/Awesome-LLM-for-RecSys.

* 15 pages; 3 figures; summarization table in appendix 
Viaarxiv icon

A Bird's-eye View of Reranking: from List Level to Page Level

Nov 17, 2022
Yunjia Xi, Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Rui Zhang, Ruiming Tang, Yong Yu

Figure 1 for A Bird's-eye View of Reranking: from List Level to Page Level
Figure 2 for A Bird's-eye View of Reranking: from List Level to Page Level
Figure 3 for A Bird's-eye View of Reranking: from List Level to Page Level
Figure 4 for A Bird's-eye View of Reranking: from List Level to Page Level

Reranking, as the final stage of multi-stage recommender systems, refines the initial lists to maximize the total utility. With the development of multimedia and user interface design, the recommendation page has evolved to a multi-list style. Separately employing traditional list-level reranking methods for different lists overlooks the inter-list interactions and the effect of different page formats, thus yielding suboptimal reranking performance. Moreover, simply applying a shared network for all the lists fails to capture the commonalities and distinctions in user behaviors on different lists. To this end, we propose to draw a bird's-eye view of \textbf{page-level reranking} and design a novel Page-level Attentional Reranking (PAR) model. We introduce a hierarchical dual-side attention module to extract personalized intra- and inter-list interactions. A spatial-scaled attention network is devised to integrate the spatial relationship into pairwise item influences, which explicitly models the page format. The multi-gated mixture-of-experts module is further applied to capture the commonalities and differences of user behaviors between different lists. Extensive experiments on a public dataset and a proprietary dataset show that PAR significantly outperforms existing baseline models.

* WSDM 2023. More readable and full version 
Viaarxiv icon

Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

Oct 11, 2022
Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang

Figure 1 for Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems
Figure 2 for Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems
Figure 3 for Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems
Figure 4 for Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

Recommender systems are expected to be assistants that help human users find relevant information in an automatic manner without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase of the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption. A natural following question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the manipulation level? In this paper, we present a general framework for benchmarking the degree of manipulations of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of three stages, initial preference calculation, algorithm training and interaction, and metrics calculation that involves two proposed metrics, Manipulation Score and Preference Shift. We benchmark some representative recommendation algorithms in both synthetic and real-world datasets under the proposed framework. We have observed that a high online click-through rate does not mean a better understanding of user initial preference, but ends in prompting users to choose more documents they initially did not favor. Moreover, we find that the properties of training data have notable impacts on the manipulation degrees, and algorithms with more powerful modeling abilities are more sensitive to such impacts. The experiments also verified the usefulness of the proposed metrics for measuring the degree of manipulations. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulations.

* 28 pages 
Viaarxiv icon

An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates

Jul 21, 2022
Khalid Oublal, Xinyi Dai

Figure 1 for An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates
Figure 2 for An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates
Figure 3 for An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates
Figure 4 for An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates

Fully Automatic License Plate Recognition (ALPR) has been a frequent research topic due to several practical applications. However, many of the current solutions are still not robust enough in real situations, commonly depending on many constraints. This paper presents a robust and efficient ALPR system based on the state-of-the-art YOLO object detector and Normalizing flows. The model uses two new strategies. Firstly, a two-stage network using YOLO and a normalization flow-based model for normalization to detect Licenses Plates (LP) and recognize the LP with numbers and Arabic characters. Secondly, Multi-scale image transformations are implemented to provide a solution to the problem of the YOLO cropped LP detection including significant background noise. Furthermore, extensive experiments are led on a new dataset with realistic scenarios, we introduce a larger public annotated dataset collected from Moroccan plates. We demonstrate that our proposed model can learn on a small number of samples free of single or multiple characters. The dataset will also be made publicly available to encourage further studies and research on plate detection and recognition.

* arXiv admin note: text overlap with arXiv:1802.09567 by other authors; text overlap with arXiv:2012.06737 by other authors without attribution 
Viaarxiv icon

A Graph-Enhanced Click Model for Web Search

Jun 17, 2022
Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Shuai Li, Ruiming Tang, Xiuqiang He, Jianye Hao, Yong Yu

Figure 1 for A Graph-Enhanced Click Model for Web Search
Figure 2 for A Graph-Enhanced Click Model for Web Search
Figure 3 for A Graph-Enhanced Click Model for Web Search
Figure 4 for A Graph-Enhanced Click Model for Web Search

To better exploit search logs and model users' behavior patterns, numerous click models are proposed to extract users' implicit interaction feedback. Most traditional click models are based on the probabilistic graphical model (PGM) framework, which requires manually designed dependencies and may oversimplify user behaviors. Recently, methods based on neural networks are proposed to improve the prediction accuracy of user behaviors by enhancing the expressive ability and allowing flexible dependencies. However, they still suffer from the data sparsity and cold-start problems. In this paper, we propose a novel graph-enhanced click model (GraphCM) for web search. Firstly, we regard each query or document as a vertex, and propose novel homogeneous graph construction methods for queries and documents respectively, to fully exploit both intra-session and inter-session information for the sparsity and cold-start problems. Secondly, following the examination hypothesis, we separately model the attractiveness estimator and examination predictor to output the attractiveness scores and examination probabilities, where graph neural networks and neighbor interaction techniques are applied to extract the auxiliary information encoded in the pre-constructed homogeneous graphs. Finally, we apply combination functions to integrate examination probabilities and attractiveness scores into click predictions. Extensive experiments conducted on three real-world session datasets show that GraphCM not only outperforms the state-of-art models, but also achieves superior performance in addressing the data sparsity and cold-start problems.

* 10 pages; Accepted by SIGIR 2021 
Viaarxiv icon

Multi-Level Interaction Reranking with User Behavior History

Apr 20, 2022
Yunjia Xi, Weiwen Liu, Jieming Zhu, Xilong Zhao, Xinyi Dai, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

Figure 1 for Multi-Level Interaction Reranking with User Behavior History
Figure 2 for Multi-Level Interaction Reranking with User Behavior History
Figure 3 for Multi-Level Interaction Reranking with User Behavior History
Figure 4 for Multi-Level Interaction Reranking with User Behavior History

As the final stage of the multi-stage recommender system (MRS), reranking directly affects users' experience and satisfaction, thus playing a critical role in MRS. Despite the improvement achieved in the existing work, three issues are yet to be solved. First, users' historical behaviors contain rich preference information, such as users' long and short-term interests, but are not fully exploited in reranking. Previous work typically treats items in history equally important, neglecting the dynamic interaction between the history and candidate items. Second, existing reranking models focus on learning interactions at the item level while ignoring the fine-grained feature-level interactions. Lastly, estimating the reranking score on the ordered initial list before reranking may lead to the early scoring problem, thereby yielding suboptimal reranking performance. To address the above issues, we propose a framework named Multi-level Interaction Reranking (MIR). MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list. We design a novel SLAttention structure for modeling the set-to-list interactions with personalized long-short term interests. Moreover, feature-level interactions are incorporated to capture the fine-grained influence among items. We design MIR in such a way that any permutation of the input items would not change the output ranking, and we theoretically prove it. Extensive experiments on three public and proprietary datasets show that MIR significantly outperforms the state-of-the-art models using various ranking and utility metrics.

Viaarxiv icon

Towards Return Parity in Markov Decision Processes

Nov 19, 2021
Jianfeng Chi, Jian Shen, Xinyi Dai, Weinan Zhang, Yuan Tian, Han Zhao

Figure 1 for Towards Return Parity in Markov Decision Processes
Figure 2 for Towards Return Parity in Markov Decision Processes
Figure 3 for Towards Return Parity in Markov Decision Processes
Figure 4 for Towards Return Parity in Markov Decision Processes

Algorithmic decisions made by machine learning models in high-stakes domains may have lasting impacts over time. Unfortunately, naive applications of standard fairness criterion in static settings over temporal domains may lead to delayed and adverse effects. To understand the dynamics of performance disparity, we study a fairness problem in Markov decision processes (MDPs). Specifically, we propose return parity, a fairness notion that requires MDPs from different demographic groups that share the same state and action spaces to achieve approximately the same expected time-discounted rewards. We first provide a decomposition theorem for return disparity, which decomposes the return disparity of any two MDPs into the distance between group-wise reward functions, the discrepancy of group policies, and the discrepancy between state visitation distributions induced by the group policies. Motivated by our decomposition theorem, we propose algorithms to mitigate return disparity via learning a shared group policy with state visitation distributional alignment using integral probability metrics. We conduct experiments to corroborate our results, showing that the proposed algorithm can successfully close the disparity gap while maintaining the performance of policies on two real-world recommender system benchmark datasets.

Viaarxiv icon

Context-aware Reranking with Utility Maximization for Recommendation

Oct 18, 2021
Yunjia Xi, Weiwen Liu, Xinyi Dai, Ruiming Tang, Weinan Zhang, Qing Liu, Xiuqiang He, Yong Yu

Figure 1 for Context-aware Reranking with Utility Maximization for Recommendation
Figure 2 for Context-aware Reranking with Utility Maximization for Recommendation
Figure 3 for Context-aware Reranking with Utility Maximization for Recommendation
Figure 4 for Context-aware Reranking with Utility Maximization for Recommendation

As a critical task for large-scale commercial recommender systems, reranking has shown the potential of improving recommendation results by uncovering mutual influence among items. Reranking rearranges items in the initial ranking lists from the previous ranking stage to better meet users' demands. However, rather than considering the context of initial lists as most existing methods do, an ideal reranking algorithm should consider the counterfactual context -- the position and the alignment of the items in the reranked lists. In this work, we propose a novel pairwise reranking framework, Context-aware Reranking with Utility Maximization for recommendation (CRUM), which maximizes the overall utility after reranking efficiently. Specifically, we first design a utility-oriented evaluator, which applies Bi-LSTM and graph attention mechanism to estimate the listwise utility via the counterfactual context modeling. Then, under the guidance of the evaluator, we propose a pairwise reranker model to find the most suitable position for each item by swapping misplaced item pairs. Extensive experiments on two benchmark datasets and a proprietary real-world dataset demonstrate that CRUM significantly outperforms the state-of-the-art models in terms of both relevance-based metrics and utility-based metrics.

Viaarxiv icon