Fuyuan Lyu

Towards Automated Negative Sampling in Implicit Recommendation

Nov 06, 2023
Fuyuan Lyu, Yaochen Hu, Xing Tang, Yingxue Zhang, Ruiming Tang, Xue Liu

Negative sampling methods are vital in implicit recommendation models as they allow us to obtain negative instances from massive unlabeled data. Most existing approaches focus on sampling hard negative samples in various ways. These studies are orthogonal to the recommendation model and implicit datasets. However, such an idea contradicts the common belief in AutoML that the model and dataset should be matched. Empirical experiments suggest that the best-performing negative sampler depends on the implicit dataset and the specific recommendation model. Hence, we hypothesize that the negative sampler should align with the capacity of the recommendation model as well as the statistics of the dataset to achieve optimal performance; a mismatch among the three leads to sub-optimal outcomes. An intuitive way to address this mismatch is to exhaustively select the best-performing negative sampler for a given model and dataset, but such an approach is computationally expensive and time-consuming, leaving the problem unsolved. In this work, we propose AutoSample, a framework that adaptively selects the best-performing negative sampler among candidates. Specifically, we propose a loss-to-instance approximation that transforms the negative-sampler search task into a learning task over a weighted sum, enabling end-to-end training of the model. We also design an adaptive search algorithm to explore the search space extensively and efficiently. A specific initialization approach is further introduced to better utilize the model parameters obtained during the search stage, which resembles curriculum learning and leads to better performance and lower computational cost. We evaluate the proposed framework on four benchmarks over three models. Extensive experiments demonstrate the effectiveness and efficiency of our proposed framework.
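The abstract does not spell out the loss-to-instance approximation, but a minimal sketch of the general idea, a softmax-weighted sum of per-sampler losses over hypothetical candidate samplers and a toy matrix-factorization model (the sampler names, the BPR loss, and all sizes are illustrative assumptions, not the paper's actual implementation), could look like this:

```python
import torch
import torch.nn.functional as F

NUM_USERS, NUM_ITEMS, DIM = 100, 1000, 16   # illustrative sizes

class MF(torch.nn.Module):
    """Tiny matrix-factorization recommender, used only for illustration."""
    def __init__(self):
        super().__init__()
        self.user_emb = torch.nn.Embedding(NUM_USERS, DIM)
        self.item_emb = torch.nn.Embedding(NUM_ITEMS, DIM)

    def bpr_loss(self, users, pos_items, neg_items):
        u = self.user_emb(users)
        pos = (u * self.item_emb(pos_items)).sum(-1)
        neg = (u * self.item_emb(neg_items)).sum(-1)
        return -F.logsigmoid(pos - neg).mean()

# Hypothetical candidate negative samplers. Real candidates would differ in
# how "hard" their negatives are; here they are uniform placeholders.
def uniform_sampler(users, pos_items):
    return torch.randint(0, NUM_ITEMS, pos_items.shape)

candidate_samplers = [uniform_sampler, uniform_sampler, uniform_sampler]

# Learnable architecture weights, one per candidate sampler.
alpha = torch.nn.Parameter(torch.zeros(len(candidate_samplers)))

def weighted_sampler_loss(model, users, pos_items):
    """Relax the discrete sampler choice into a softmax-weighted sum of the
    per-sampler losses, so the choice is trainable end-to-end with the model."""
    probs = F.softmax(alpha, dim=0)
    return sum(w * model.bpr_loss(users, pos_items, sampler(users, pos_items))
               for w, sampler in zip(probs, candidate_samplers))

model = MF()
users = torch.randint(0, NUM_USERS, (32,))
pos_items = torch.randint(0, NUM_ITEMS, (32,))
loss = weighted_sampler_loss(model, users, pos_items)
loss.backward()   # gradients flow to both the model parameters and alpha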

Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network

Oct 30, 2023
Fuyuan Lyu, Xing Tang, Dugang Liu, Chen Ma, Weihong Luo, Liang Chen, Xiuqiang He, Xue Liu

Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, for which feature interaction selection is a critical component. While previous methods primarily focus on how to search for feature interactions in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-grained feature interaction selection approach that targets both feature fields and feature values for deep sparse networks. To explore such an expansive space, we propose a decomposed space that is calculated on the fly. We then develop a selection algorithm called OptFeature, which efficiently selects feature interactions from the feature field and the feature value simultaneously. Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method.
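As a rough sketch only (not the paper's OptFeature implementation), hybrid-grained gating with a decomposed, on-the-fly value-level space could be expressed as a field-pair gate combined with the product of two per-value gates, which avoids materializing a gate for every value pair; all names and sizes below are illustrative assumptions:

```python
import torch

NUM_FIELDS, NUM_VALUES = 4, 1000   # illustrative sizes

# Field-grained gates: one learnable logit per field pair.
field_gate = torch.nn.Parameter(torch.zeros(NUM_FIELDS, NUM_FIELDS))

# Value-grained gates, decomposed: rather than one gate per value pair
# (NUM_VALUES**2 parameters), keep one logit per value and form the pair
# gate on the fly as the product of the two per-value gates.
value_gate = torch.nn.Parameter(torch.zeros(NUM_VALUES))

def interaction_weight(field_i, field_j, value_i, value_j):
    """Soft gate for the interaction between two concrete feature values,
    combining the field-level and the decomposed value-level signal."""
    g_field = torch.sigmoid(field_gate[field_i, field_j])
    g_value = torch.sigmoid(value_gate[value_i]) * torch.sigmoid(value_gate[value_j])
    return g_field * g_value

# Example: gate for the interaction between value 3 of field 0 and value 42 of field 2.
w = interaction_weight(0, 2, 3, 42)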

* NeurIPS 2023 poster 

OptMSM: Optimizing Multi-Scenario Modeling for Click-Through Rate Prediction

Jun 23, 2023
Xing Tang, Yang Qiao, Yuwen Fu, Fuyuan Lyu, Dugang Liu, Xiuqiang He

A large-scale industrial recommendation platform typically consists of multiple associated scenarios, requiring a unified click-through rate (CTR) prediction model to serve them simultaneously. Existing approaches for multi-scenario CTR prediction generally consist of two main modules: i) a scenario-aware learning module that learns a set of multi-functional representations with scenario-shared and scenario-specific information from input features, and ii) a scenario-specific prediction module that serves each scenario based on these representations. However, most of these approaches primarily focus on improving the former module and neglect the latter, which can lead to increased model parameter size, training difficulty, and performance bottlenecks for each scenario. To address these issues, we propose a novel framework called OptMSM (\textbf{Opt}imizing \textbf{M}ulti-\textbf{S}cenario \textbf{M}odeling). First, we introduce a simplified yet effective scenario-enhanced learning module to alleviate these challenges. Specifically, we partition the input features into scenario-specific and scenario-shared features, which are mapped to scenario-specific information embeddings and a set of shared information embeddings, respectively. We impose an orthogonality constraint on the shared information embeddings to disentangle the shared information corresponding to each scenario, and combine them with the specific information embeddings to obtain multi-functional representations. Second, we introduce a scenario-specific hypernetwork in the scenario-specific prediction module to capture interactions within each scenario more effectively, thereby alleviating the performance bottlenecks. Finally, we conduct extensive offline experiments and an online A/B test to demonstrate the effectiveness of OptMSM.
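For intuition only, a minimal sketch of the two ingredients named in the abstract, an orthogonality penalty on per-scenario shared embeddings and a hypernetwork that generates a per-scenario prediction head, might look as follows; the bilinear head, class names, and sizes are assumptions rather than OptMSM's actual design:

```python
import torch
import torch.nn.functional as F

def orthogonality_penalty(shared_emb):
    """Penalize overlap between the per-scenario shared-information
    embeddings; shared_emb has shape (num_scenarios, dim)."""
    normed = F.normalize(shared_emb, dim=-1)
    gram = normed @ normed.t()                      # pairwise cosine similarities
    off_diag = gram - torch.diag(torch.diag(gram))  # keep only cross-scenario terms
    return (off_diag ** 2).sum()

class ScenarioHyperNet(torch.nn.Module):
    """Hypernetwork that generates the weights of a per-scenario prediction
    head from a scenario embedding (sizes and head form are illustrative)."""
    def __init__(self, scenario_dim=8, repr_dim=32):
        super().__init__()
        self.gen = torch.nn.Linear(scenario_dim, repr_dim)  # produces head weights

    def forward(self, scenario_emb, representation):
        head_weights = self.gen(scenario_emb)                # (batch, repr_dim)
        logits = (head_weights * representation).sum(-1)     # per-sample scoring
        return torch.sigmoid(logits)                         # predicted CTR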

* Accepted by ECML-PKDD 2023 Applied Data Science Track 

Explicit Feature Interaction-aware Uplift Network for Online Marketing

Jun 01, 2023
Dugang Liu, Xing Tang, Han Gao, Fuyuan Lyu, Xiuqiang He

As a key component in online marketing, uplift modeling aims to accurately capture the degree to which different treatments, such as coupons or discounts, motivate different users, also known as the estimation of the individual treatment effect (ITE). In an actual business scenario, the options for treatment may be numerous and complex, and there may be correlations between different treatments. In addition, each marketing instance may also have rich user and contextual features. However, existing methods still fall short in both fully exploiting treatment information and mining features that are sensitive to a particular treatment. In this paper, we propose an explicit feature interaction-aware uplift network (EFIN) to address these two problems. EFIN includes four customized modules: 1) a feature encoding module that encodes not only the user and contextual features but also the treatment features; 2) a self-interaction module that models the user's natural response with all but the treatment features; 3) a treatment-aware interaction module that models the degree to which a particular treatment motivates a user through interactions between the treatment features and other features, i.e., the ITE; and 4) an intervention constraint module that balances the ITE distribution of users between the control and treatment groups, so that the model still achieves an accurate uplift ranking on data collected from a non-random intervention marketing scenario. We conduct extensive experiments on two public datasets and one product dataset to verify the effectiveness of EFIN. In addition, EFIN has been deployed in a credit card bill payment scenario of a large online financial platform, where it achieved a significant improvement.
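To make module 3) concrete, here is a toy sketch of a treatment-aware interaction: each user/context feature embedding is weighted by how strongly it interacts with the treatment embedding. The bilinear attention form and all names are illustrative assumptions, not EFIN's actual module:

```python
import torch

class TreatmentAwareInteraction(torch.nn.Module):
    """Toy treatment-aware interaction: weight each user/context feature by
    how strongly it interacts with the treatment embedding."""
    def __init__(self, dim=16):
        super().__init__()
        self.att = torch.nn.Linear(dim, dim, bias=False)

    def forward(self, treatment_emb, feature_embs):
        # treatment_emb: (batch, dim); feature_embs: (batch, num_features, dim)
        scores = torch.einsum('bd,bfd->bf', self.att(treatment_emb), feature_embs)
        weights = torch.softmax(scores, dim=-1)          # treatment-sensitive features
        interacted = (weights.unsqueeze(-1) * feature_embs).sum(dim=1)
        return interacted * treatment_emb                # element-wise interaction term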

* Accepted by SIGKDD 2023 Applied Data Science Track 

Feature Representation Learning for Click-through Rate Prediction: A Review and New Perspectives

Feb 04, 2023
Fuyuan Lyu, Xing Tang, Dugang Liu, Haolun Wu, Chen Ma, Xiuqiang He, Xue Liu

Representation learning has been a critical topic in machine learning. In click-through rate (CTR) prediction, most features are represented as embedding vectors and learned simultaneously with the other parameters in the model. With the development of CTR models, feature representation learning has become a trending topic and has been extensively studied by both industrial and academic researchers in recent years. This survey aims to summarize feature representation learning in a broader picture and pave the way for future research. To achieve this goal, we first present a taxonomy of current research methods on feature representation learning along two main issues: (i) which features to represent and (ii) how to represent these features. We then give a detailed description of each method regarding these two issues. Finally, the review concludes with a discussion of the future directions of this field.

* Submitted to IJCAI 2023 Survey Track 

Optimizing Feature Set for Click-Through Rate Prediction

Jan 26, 2023
Fuyuan Lyu, Xing Tang, Dugang Liu, Liang Chen, Xiuqiang He, Xue Liu

Click-through rate (CTR) prediction models transform features into latent vectors and enumerate possible feature interactions over the input feature set to improve performance. Therefore, when selecting an optimal feature set, we should consider the influence of both the features and their interactions. However, most previous works focus on either feature field selection or selecting feature interactions based on a fixed feature set to produce the final feature set. The former restricts the search space to the feature field, which is too coarse to determine subtle features, and does not filter useless feature interactions, leading to higher computation costs and degraded model performance. The latter identifies useful feature interactions from all available features, resulting in many redundant features in the feature set. In this paper, we propose a novel method named OptFS to address these problems. To unify the selection of features and their interactions, we decompose the selection of each feature interaction into the selection of two correlated features. Such a decomposition makes the model end-to-end trainable under various feature interaction operations. By adopting a feature-level search space, we set a learnable gate to determine whether each feature should be within the feature set. Because of the large-scale search space, we develop a learning-by-continuation training scheme to learn such gates. Hence, OptFS generates a feature set containing only features that improve the final prediction results. Experimentally, we evaluate OptFS on three public datasets, demonstrating that OptFS can optimize feature sets which enhance model performance and further reduce both storage and computational costs.
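A minimal sketch of the decomposition idea, assuming one learnable gate per feature value and a temperature-annealed sigmoid as one possible way to realize a learning-by-continuation scheme (the names, sizes, and annealing choice are assumptions, not OptFS's actual training scheme):

```python
import torch

NUM_FEATURE_VALUES = 10000   # illustrative vocabulary size

# One learnable gate per feature value; keeping or dropping an interaction
# (i, j) is decomposed into the product of the two feature-level gates.
gate_logits = torch.nn.Parameter(torch.zeros(NUM_FEATURE_VALUES))

def interaction_mask(idx_i, idx_j, temperature=1.0):
    """Soft mask for the interaction between feature values idx_i and idx_j.
    Annealing the temperature towards zero over training pushes the mask
    towards a hard 0/1 selection, so features whose gate goes to zero are
    effectively removed from the feature set."""
    g_i = torch.sigmoid(gate_logits[idx_i] / temperature)
    g_j = torch.sigmoid(gate_logits[idx_j] / temperature)
    return g_i * g_j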

* Accepted by WWW 2023 Research Tracks 

A Survey of Diversification Techniques in Search and Recommendation

Jan 02, 2023
Haolun Wu, Yansen Zhang, Chen Ma, Fuyuan Lyu, Fernando Diaz, Xue Liu

Diversifying search results is an important research topic in retrieval systems, aiming to satisfy both the varied interests of customers and the equal market exposure of providers. There has been growing attention to diversity-aware research in recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems, followed by a summary of the various diversity concerns in search and recommendation, highlighting their relationship and differences. In the main body of the survey, we present a unified taxonomy of diversification metrics and approaches in retrieval systems, from both the search and recommendation perspectives. In the later part of the survey, we discuss the open research questions of diversity-aware research in search and recommendation in an effort to inspire future innovations and encourage the implementation of diversity in real-world systems.

OptEmbed: Learning Optimal Embedding Table for Click-through Rate Prediction

Aug 09, 2022
Fuyuan Lyu, Xing Tang, Hong Zhu, Huifeng Guo, Yingxue Zhang, Ruiming Tang, Xue Liu

Learning the embedding table plays a fundamental role in click-through rate (CTR) prediction from the perspective of both model performance and memory usage. The embedding table is a two-dimensional tensor, with its axes indicating the number of feature values and the embedding dimension, respectively. To learn an efficient and effective embedding table, recent works either assign varying embedding dimensions to feature fields and reduce the number of embeddings, or mask the embedding table parameters. However, none of these existing works obtains an optimal embedding table. On the one hand, varying embedding dimensions still require a large amount of memory due to the vast number of features in the dataset. On the other hand, decreasing the number of embeddings usually causes performance degradation, which is intolerable in CTR prediction. Finally, pruning embedding parameters leads to a sparse embedding table, which is hard to deploy. To this end, we propose OptEmbed, an optimal embedding table learning framework that provides a practical and general method to find an optimal embedding table for various base CTR models. Specifically, we propose pruning redundant embeddings according to the corresponding features' importance via learnable pruning thresholds. Furthermore, we consider assigning various embedding dimensions as a single candidate architecture. To efficiently search for the optimal embedding dimensions, we design a uniform embedding dimension sampling scheme to train all candidate architectures equally, meaning that architecture-related parameters and learnable thresholds are trained simultaneously in one supernet. We then propose an evolutionary search method based on the supernet to find the optimal embedding dimensions for each field. Experiments on public datasets show that OptEmbed can learn a compact embedding table which further improves model performance.
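As a rough sketch of the two mechanisms named above (learnable pruning thresholds and uniform dimension sampling for supernet training), under simplifying assumptions such as a single global threshold and no straight-through estimator; names and sizes are illustrative, not OptEmbed's actual code:

```python
import torch

NUM_FEATURES, MAX_DIM = 1000, 32   # illustrative sizes

embedding = torch.nn.Parameter(torch.randn(NUM_FEATURES, MAX_DIM) * 0.01)
threshold = torch.nn.Parameter(torch.zeros(1))   # learnable pruning threshold

def lookup(feature_ids):
    """Drop whole embeddings whose L1 norm falls below the learnable
    threshold (a real implementation would use a straight-through
    estimator to keep the hard mask differentiable)."""
    emb = embedding[feature_ids]
    keep = (emb.abs().sum(dim=-1, keepdim=True) >= threshold).float()
    return emb * keep

def sample_dimension_mask():
    """Uniformly sample an embedding dimension for supernet training:
    keep the first d dimensions of every embedding and zero out the rest."""
    d = torch.randint(1, MAX_DIM + 1, (1,)).item()
    mask = torch.zeros(MAX_DIM)
    mask[:d] = 1.0
    return mask

# A supernet training step would then use: lookup(ids) * sample_dimension_mask()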

* Accepted by CIKM 2022 Research Track 

Encoder-Decoder Architecture for Supervised Dynamic Graph Learning: A Survey

Mar 27, 2022
Yuecai Zhu, Fuyuan Lyu, Chengming Hu, Xi Chen, Xue Liu

In recent years, prevalent online services have generated an enormous volume of user activity data. Service providers collect these data in order to perform client behavior analysis and offer better, more customized services. The majority of these data can be modeled and stored as graphs, such as the social graph in Facebook and the user-video interaction graph in YouTube. These graphs need to evolve over time to capture the dynamics of the real world, leading to the invention of dynamic graphs. However, the temporal information embedded in dynamic graphs brings new challenges in analyzing and deploying them. Event staleness, temporal information learning, and explicit time dimension usage are some example challenges in dynamic graph learning. In order to offer a convenient reference to both industry and academia, this survey presents the Three Stages Recurrent Temporal Learning Framework based on dynamic graph evolution theories, so as to interpret the learning of temporal information within a generalized framework. Under this framework, this survey categorizes and reviews different learnable encoder-decoder architectures for supervised dynamic graph learning. We believe that this survey can supply useful guidelines to researchers and engineers in finding suitable graph structures for their dynamic learning tasks.

* Optimize title for better visibility 