Recommender systems (RS) are an essential part of online services, especially on e-commerce platforms. Conversion Rate (CVR) prediction in RS plays a significant role in optimizing the Gross Merchandise Volume (GMV) goal of e-commerce. However, CVR prediction suffers from the well-known Sample Selection Bias (SSB) and Data Sparsity (DS) problems. Although the existing methods ESMM and ESM2 train with all impression samples over the entire space by modeling user behavior paths, the SSB and DS problems still exist. In practice, the online inference space consists of samples from the previous stage of the RS process, rather than the impression space modeled by existing methods. Moreover, existing methods address the DS problem mainly by building behavior paths within their own specific scene, ignoring behaviors in the other scenes of the e-commerce platform. In this paper, we propose Entire Space Learning Framework: Unbias Conversion Rate Prediction in Full Stages of Recommender System, which addresses the SSB and DS problems by reformulating the GMV goal in a novel manner. Specifically, we rebuild CVR on the entire data space with samples from the previous stage of the RS process, unifying the training and online inference spaces. Moreover, we explicitly introduce purchase samples from other scenes of the e-commerce platform into the model learning process. Online A/B tests and offline experiments show the superiority of our framework. Our framework has been deployed in the rank stage of Taobao recommendation, providing recommendation service for hundreds of millions of consumers every day.
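For orientation, the entire-space idea behind the ESMM baseline mentioned above can be sketched as follows. ESMM factorizes the click-and-convert probability as pCTCVR = pCTR × pCVR and supervises pCTR and pCTCVR on all impressions, so pCVR is never trained on clicked samples alone. This is a minimal sketch of that baseline loss, not of the framework proposed in the abstract; the function names and the sample tuple layout are assumptions made for illustration.

```python
import math


def bce(p, label, eps=1e-7):
    """Binary cross-entropy for one prediction, clipped for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def entire_space_loss(samples):
    """Average ESMM-style loss over all impressions.

    samples: list of (p_ctr, p_cvr, clicked, purchased) tuples (hypothetical
    layout), where p_ctr and p_cvr are model outputs in [0, 1] and the labels
    are 0 or 1.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    total = 0.0
    for p_ctr, p_cvr, clicked, purchased in samples:
        # pCTCVR = pCTR * pCVR, defined for every impression.
        p_ctcvr = p_ctr * p_cvr
        # CTR task uses the click label; CTCVR task uses click AND purchase.
        total += bce(p_ctr, clicked) + bce(p_ctcvr, clicked * purchased)
    return total / len(samples)
```

Because both loss terms are defined on every impression, no term conditions on the click event, which is how this style of objective sidesteps training CVR on a biased subset.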
In this era of information explosion, a personalized recommendation system helps users find the information they are interested in. To deal with billions of users and items, large-scale online recommendation services usually consist of three stages: candidate generation, coarse-grained ranking, and fine-grained ranking. The success of each stage depends on whether the model accurately captures the interests of users, which are usually hidden in users' behavior data. Previous research shows that users' interests are diverse, and one vector is not sufficient to capture users' different preferences. Therefore, many methods use multiple vectors to encode users' interests. However, there are two unsolved problems: (1) The similarity between the different vectors in existing methods is too high, so they carry too much redundant information. Consequently, the interests of users are not fully represented. (2) Existing methods model long-term and short-term behaviors together, ignoring the differences between them. This paper proposes a Hierarchical Multi-Interest Co-Network (HCN) to capture users' diverse interests in the coarse-grained ranking stage. Specifically, we design a hierarchical multi-interest extraction layer to update users' diverse interest centers iteratively. The multiple embedded vectors obtained in this way contain more information and better represent the interests of users in various aspects. Furthermore, we develop a Co-Interest Network to integrate users' long-term and short-term interests. Experiments on several real-world datasets and one large-scale industrial dataset show that HCN outperforms the state-of-the-art methods. We deploy HCN into a large-scale real-world e-commerce system and achieve an additional 2.5\% improvement in GMV (Gross Merchandise Value).
Recent studies on Click-Through Rate (CTR) prediction have reached new levels by modeling longer user behavior sequences. Among them, two-stage methods stand out as the state-of-the-art (SOTA) solution for industrial applications. Two-stage methods first train a retrieval model to truncate the long behavior sequence beforehand and then use the truncated sequences to train a CTR model. However, the retrieval model and the CTR model are trained separately, so the subsequences retrieved for the CTR model are inaccurate, which degrades the final performance. In this paper, we propose an end-to-end paradigm to model long behavior sequences, which achieves superior performance along with remarkable cost-efficiency compared to existing models. Our contribution is three-fold. First, we propose a hashing-based efficient target attention (TA) network named ETA-Net to enable end-to-end user behavior retrieval based on low-cost bit-wise operations. The proposed ETA-Net can reduce the complexity of standard TA by orders of magnitude for sequential data modeling. Second, we propose a general system architecture as one viable solution to deploy ETA-Net on industrial systems. In particular, ETA-Net has been deployed on the recommender system of Taobao, and brought a 1.8% lift in CTR and a 3.1% lift in Gross Merchandise Value (GMV) compared to the SOTA two-stage methods. Third, we conduct extensive experiments on both offline datasets and online A/B tests. The results verify that the proposed model outperforms existing CTR models considerably, in terms of both CTR prediction performance and online cost-efficiency. ETA-Net now serves the main traffic of Taobao, delivering services to hundreds of millions of users across billions of items every day.
Click-Through Rate (CTR) prediction is one of the core tasks in recommender systems (RS). It predicts a personalized click probability for each user-item pair. Recently, researchers have found that the performance of CTR models can be improved greatly by taking the user behavior sequence into consideration, especially the long-term user behavior sequence. A report on an e-commerce website shows that 23\% of users have more than 1000 clicks during the past 5 months. Though numerous works focus on modeling sequential user behaviors, few can handle long-term user behavior sequences due to the strict inference time constraints of real-world systems. Two-stage methods have been proposed to push the limit for better performance. In the first stage, an auxiliary task is designed to retrieve the top-$k$ similar items from the long-term user behavior sequence. In the second stage, the classical attention mechanism is applied between the candidate item and the $k$ items selected in the first stage. However, an information gap arises between the retrieval stage and the main CTR task. This goal divergence can greatly diminish the performance gain from long-term user sequences. In this paper, inspired by Reformer, we propose a locality-sensitive hashing (LSH) method called ETA (End-to-end Target Attention), which greatly reduces the training and inference cost and makes end-to-end training with long-term user behavior sequences possible. Both offline and online experiments confirm the effectiveness of our model. We deploy ETA into a large-scale real-world e-commerce system and achieve an additional 3.1\% improvement in GMV (Gross Merchandise Value) compared to a two-stage long user sequence CTR model.
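The retrieve-then-attend idea described above can be illustrated with a small sketch. It assumes a SimHash-style LSH (one sign bit per random hyperplane), ranks behaviors by Hamming distance to the candidate item using bit-wise operations, and then applies plain softmax attention over the top-$k$ items. The abstract does not specify the hash function or the attention details, so the function names, the choice of SimHash, and the unprojected dot-product attention are all assumptions made for illustration.

```python
import math
import random


def make_planes(dim, n_bits, seed=0):
    """Random hyperplanes shared by all items (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vec, planes):
    """Integer fingerprint: one bit per hyperplane, set when the dot product is >= 0."""
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits, computed with XOR and a bit count."""
    return bin(a ^ b).count("1")


def top_k_by_hamming(target, behaviors, planes, k):
    """Indices of the k behaviors whose fingerprints are closest to the target's."""
    t_code = simhash(target, planes)
    codes = [simhash(b, planes) for b in behaviors]
    order = sorted(range(len(behaviors)), key=lambda i: hamming(t_code, codes[i]))
    return order[:k]


def target_attention(target, items):
    """Softmax dot-product attention of the target over the selected items."""
    scores = [sum(t * x for t, x in zip(target, item)) for item in items]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * item[c] for w, item in zip(weights, items))
            for c in range(len(target))]
```

Since the fingerprints depend only on the embeddings and the fixed hyperplanes, retrieval needs no separately trained model, which is the property that removes the gap between the retrieval stage and the CTR task in this sketch.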
Routing strategies for traffic and vehicles have long been studied. However, because they do not consider drivers' preferences, current route planning algorithms are developed under ideal conditions in which all drivers are expected to behave rationally and properly. In jumbled urban road networks in particular, drivers' actual routing strategies deteriorate into a series of empirical and selfish decisions that result in congestion. Evidently, if minimum mobility can be kept, traffic congestion can be avoided by dispersing the traffic load. In this paper, we establish a novel dynamic routing method that caters to drivers' preferences while retaining maximum traffic mobility through multi-agent systems (MAS). By modeling human drivers' behavior through agents' dynamics, MAS can analyze the global behavior of the entire traffic flow. Therefore, by regarding agents as particles in smoothed particle hydrodynamics (SPH), we can make the traffic flow behave like a real fluid flow. Because such a flow tends to distribute itself uniformly over the road network, our dynamic routing method realizes traffic load balancing without violating individual drivers' time-saving motivation. Moreover, as a discrete control mechanism, our method is robust to chaos, meaning that drivers' disobedience can be tolerated. Because control is based on SPH density, the only intelligent transportation system (ITS) component we require is a location-based service (LBS). We provide a mathematical proof of the stability of the proposed control law. In addition, multiple test cases are built to verify the effectiveness of the proposed dynamic routing algorithm.
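To make the SPH-based density concrete, the following sketch estimates the density at each vehicle position along a single road using the standard SPH summation, where each agent contributes a smoothing kernel centered on its position. The one-dimensional setting, the Gaussian kernel, the unit mass per vehicle, and the function names are assumptions made for illustration; the abstract does not specify the kernel or the control law built on top of the density.

```python
import math


def kernel(r, h):
    """1D Gaussian smoothing kernel with smoothing length h (integrates to 1)."""
    return math.exp(-(r / h) ** 2) / (h * math.sqrt(math.pi))


def sph_density(positions, h, mass=1.0):
    """SPH density at each agent: the sum of kernel contributions from all agents."""
    return [
        sum(mass * kernel(xi - xj, h) for xj in positions)
        for xi in positions
    ]


# Three vehicles bunched together and one isolated vehicle further along.
positions = [0.0, 0.1, 0.2, 5.0]
densities = sph_density(positions, h=0.5)
```

In this example the bunched vehicles see a higher density than the isolated one, so a density-driven rule would steer traffic away from the cluster. Only vehicle positions are needed as input, which is consistent with the claim that a location-based service suffices.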
Deep learning based methods have been widely used in industrial recommendation systems (RSs). Previous works adopt an Embedding&MLP paradigm: raw features are embedded into low-dimensional vectors, which are then fed into an MLP for final recommendations. However, most of these works simply concatenate different features, ignoring the sequential nature of users' behaviors. In this paper, we propose to use the powerful Transformer model to capture the sequential signals underlying users' behavior sequences for recommendation at Alibaba. Experimental results demonstrate the superiority of the proposed model, which has been deployed online at Taobao and obtains significant improvements in online Click-Through Rate (CTR) compared to two baselines.
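The core operation a Transformer applies to a behavior sequence is scaled dot-product self-attention, in which every item embedding is re-expressed as a weighted mix of all items in the sequence. The sketch below shows that single operation in plain Python. It omits the learned query, key, and value projections, multiple heads, positional information, and feed-forward layers, so it is a simplified illustration under those assumptions rather than the model deployed by the authors.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with queries = keys = values = seq.

    seq: list of item embeddings (lists of floats, all the same length).
    Returns one context-aware vector per input item.
    """
    dim = len(seq[0])
    scale = math.sqrt(dim)
    out = []
    for query in seq:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in seq]
        weights = softmax(scores)
        out.append([
            sum(w * value[c] for w, value in zip(weights, seq))
            for c in range(dim)
        ])
    return out
```

Unlike concatenating features, each output vector here depends on every other behavior in the sequence, which is what lets the model pick up relationships between a user's actions.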
Industrial recommender systems usually consist of a matching stage and a ranking stage in order to handle billions of users and items. The matching stage retrieves candidate items relevant to user interests, while the ranking stage sorts candidate items by user interests. Thus, the most critical ability for either stage is to model and represent user interests. Most existing deep learning-based models represent one user as a single vector, which is insufficient to capture the varying nature of users' interests. In this paper, we approach this problem from a different view and represent one user with multiple vectors that encode the different aspects of the user's interests. We propose the Multi-Interest Network with Dynamic routing (MIND) for dealing with users' diverse interests in the matching stage. Specifically, we design a multi-interest extractor layer based on the capsule routing mechanism, which is applicable for clustering historical behaviors and extracting diverse interests. Furthermore, we develop a technique named label-aware attention to help learn a user representation with multiple vectors. Through extensive experiments on several public benchmarks and one large-scale industrial dataset from Tmall, we demonstrate that MIND outperforms state-of-the-art methods for recommendation. Currently, MIND has been deployed to handle major online traffic on the homepage of the Mobile Tmall App.
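A capsule-style routing loop of the kind referred to above can be sketched as follows. Behavior embeddings are softly assigned to a fixed number of interest capsules, each capsule is the squashed weighted sum of the behaviors, and the assignment logits are increased for behaviors that agree with a capsule. This is a generic routing-by-agreement sketch and not the exact MIND formulation: it omits the shared bilinear mapping and label-aware attention, and the function names, the Gaussian logit initialization, and the default of three iterations are assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def squash(z):
    """Capsule squash: keeps the direction of z and maps its norm into [0, 1)."""
    norm_sq = sum(x * x for x in z)
    if norm_sq == 0.0:
        return [0.0] * len(z)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in z]


def extract_interests(behaviors, num_interests, iterations=3, seed=0):
    """Cluster behavior embeddings into num_interests interest vectors.

    behaviors: non-empty list of equal-length embeddings; iterations >= 1.
    """
    rng = random.Random(seed)
    dim = len(behaviors[0])
    # Random initial logits so the capsules do not start out identical.
    logits = [[rng.gauss(0.0, 1.0) for _ in range(num_interests)]
              for _ in behaviors]
    interests = []
    for _ in range(iterations):
        # Each behavior distributes its weight across the interest capsules.
        weights = [softmax(row) for row in logits]
        interests = []
        for j in range(num_interests):
            z = [sum(weights[i][j] * behaviors[i][c]
                     for i in range(len(behaviors)))
                 for c in range(dim)]
            interests.append(squash(z))
        # Agreement update: raise the logit where behavior and capsule align.
        for i, emb in enumerate(behaviors):
            for j, cap in enumerate(interests):
                logits[i][j] += sum(a * b for a, b in zip(cap, emb))
    return interests
```

Calling `extract_interests(behaviors, num_interests=2)` returns two vectors for one user, so retrieval in the matching stage can be run once per interest vector instead of once per user.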