Recommender systems are an essential part of any e-commerce platform. Recommendations are typically generated by aggregating large amounts of user data. A malicious actor may be motivated to sway the output of such recommender systems by injecting malicious datapoints to leverage the system for financial gain. In this work, we propose a semi-supervised attack detection algorithm to identify the malicious datapoints. We do this by leveraging a portion of the dataset that has a lower chance of being polluted to learn the distribution of genuine datapoints. Our proposed approach modifies the Generative Adversarial Network architecture to take into account the contextual information from user activity. This allows the model to distinguish legitimate datapoints from the injected ones.
Recommender Systems have become an integral part of online e-Commerce platforms, driving customer engagement and revenue. Most popular recommender systems attempt to learn from users' past engagement data to understand behavioral traits of users and use that to predict future behavior. In this work, we present an approach to use causal inference to learn user-attribute affinities through temporal contexts. We formulate this objective as a Probabilistic Machine Learning problem and apply a variational inference based method to estimate the model parameters. We demonstrate the performance of the proposed method on the next attribute prediction task on two real world datasets and show that it outperforms standard baseline methods.
Searching for and making decisions about products is becoming increasingly easier in the e-commerce space, thanks to the evolution of recommender systems. Personalization and recommender systems have gone hand-in-hand to help customers fulfill their shopping needs and improve their experiences in the process. With the growing adoption of conversational platforms for shopping, it has become important to build personalized models at scale to handle the large influx of data and perform inference in real-time. In this work, we present an end-to-end machine learning system for personalized conversational voice commerce. We include components for implicit feedback to the model, model training, evaluation on update, and a real-time inference engine. Our system personalizes voice shopping for Walmart Grocery customers and is currently available via Google Assistant, Siri and Google Home devices.
The problem of basket recommendation~(BR) is to recommend a ranking list of items to the current basket. Existing methods solve this problem by assuming the items within the same basket are correlated by one semantic relation, thus optimizing the item embeddings. However, this assumption breaks when there exist multiple intents within a basket. For example, assuming a basket contains \{\textit{bread, cereal, yogurt, soap, detergent}\} where \{\textit{bread, cereal, yogurt}\} are correlated through the "breakfast" intent, while \{\textit{soap, detergent}\} are of "cleaning" intent, ignoring multiple relations among the items spoils the ability of the model to learn the embeddings. To resolve this issue, it is required to discover the intents within the basket. However, retrieving a multi-intent pattern is rather challenging, as intents are latent within the basket. Additionally, intents within the basket may also be correlated. Moreover, discovering a multi-intent pattern requires modeling high-order interactions, as the intents across different baskets are also correlated. To this end, we propose a new framework named as \textbf{M}ulti-\textbf{I}ntent \textbf{T}ranslation \textbf{G}raph \textbf{N}eural \textbf{N}etwork~({\textbf{MITGNN}}). MITGNN models $T$ intents as tail entities translated from one corresponding basket embedding via $T$ relation vectors. The relation vectors are learned through multi-head aggregators to handle user and item information. Additionally, MITGNN propagates multiple intents across our defined basket graph to learn the embeddings of users and items by aggregating neighbors. Extensive experiments on two real-world datasets prove the effectiveness of our proposed model on both transductive and inductive BR. The code is available online at https://github.com/JimLiu96/MITGNN.
Through recent advancements in speech technology and introduction of smart devices, such as Amazon Alexa and Google Home, increasing number of users are interacting with applications through voice. E-commerce companies typically display short product titles on their webpages, either human-curated or algorithmically generated, when brevity is required, but these titles are dissimilar from natural spoken language. For example, "Lucky Charms Gluten Free Break-fast Cereal, 20.5 oz a box Lucky Charms Gluten Free" is acceptable to display on a webpage, but "a 20.5 ounce box of lucky charms gluten free cereal" is easier to comprehend over a conversational system. As compared to display devices, where images and detailed product information can be presented to users, short titles for products are necessary when interfacing with voice assistants. We propose a sequence-to-sequence approach using BERT to generate short, natural, spoken language titles from input web titles. Our extensive experiments on a real-world industry dataset and human evaluation of model outputs, demonstrate that BERT summarization outperforms comparable baseline models.
Inductive representation learning on temporal graphs is an important step toward salable machine learning on real-world dynamic networks. The evolving nature of temporal dynamic graphs requires handling new nodes as well as capturing temporal patterns. The node embeddings, which are now functions of time, should represent both the static node features and the evolving topological structures. Moreover, node and topological features can be temporal as well, whose patterns the node embeddings should also capture. We propose the temporal graph attention (TGAT) layer to efficiently aggregate temporal-topological neighborhood features as well as to learn the time-feature interactions. For TGAT, we use the self-attention mechanism as building block and develop a novel functional time encoding technique based on the classical Bochner's theorem from harmonic analysis. By stacking TGAT layers, the network recognizes the node embeddings as functions of time and is able to inductively infer embeddings for both new and observed nodes as the graph evolves. The proposed approach handles both node classification and link prediction task, and can be naturally extended to include the temporal edge features. We evaluate our method with transductive and inductive tasks under temporal settings with two benchmark and one industrial dataset. Our TGAT model compares favorably to state-of-the-art baselines as well as the previous temporal graph embedding approaches.
Within-basket recommendation reduces the exploration time of users, where the user's intention of the basket matters. The intent of a shopping basket can be retrieved from both user-item collaborative filtering signals and multi-item correlations. By defining a basket entity to represent the basket intent, we can model this problem as a basket-item link prediction task in the User-Basket-Item~(UBI) graph. Previous work solves the problem by leveraging user-item interactions and item-item interactions simultaneously. However, collectivity and heterogeneity characteristics are hardly investigated before. Collectivity defines the semantics of each node which should be aggregated from both directly and indirectly connected neighbors. Heterogeneity comes from multi-type interactions as well as multi-type nodes in the UBI graph. To this end, we propose a new framework named \textbf{BasConv}, which is based on the graph convolutional neural network. Our BasConv model has three types of aggregators specifically designed for three types of nodes. They collectively learn node embeddings from both neighborhood and high-order context. Additionally, the interactive layers in the aggregators can distinguish different types of interactions. Extensive experiments on two real-world datasets prove the effectiveness of BasConv.
Sequential modelling with self-attention has achieved cutting edge performances in natural language processing. With advantages in model flexibility, computation complexity and interpretability, self-attention is gradually becoming a key component in event sequence models. However, like most other sequence models, self-attention does not account for the time span between events and thus captures sequential signals rather than temporal patterns. Without relying on recurrent network structures, self-attention recognizes event orderings via positional encoding. To bridge the gap between modelling time-independent and time-dependent event sequence, we introduce a functional feature map that embeds time span into high-dimensional spaces. By constructing the associated translation-invariant time kernel function, we reveal the functional forms of the feature map under classic functional function analysis results, namely Bochner's Theorem and Mercer's Theorem. We propose several models to learn the functional time representation and the interactions with event representation. These methods are evaluated on real-world datasets under various continuous-time event sequence prediction tasks. The experiments reveal that the proposed methods compare favorably to baseline models while also capturing useful time-event interactions.
In this paper, we propose a new product knowledge graph (PKG) embedding approach for learning the intrinsic product relations as product knowledge for e-commerce. We define the key entities and summarize the pivotal product relations that are critical for general e-commerce applications including marketing, advertisement, search ranking and recommendation. We first provide a comprehensive comparison between PKG and ordinary knowledge graph (KG) and then illustrate why KG embedding methods are not suitable for PKG learning. We construct a self-attention-enhanced distributed representation learning model for learning PKG embeddings from raw customer activity data in an end-to-end fashion. We design an effective multi-task learning schema to fully leverage the multi-modal e-commerce data. The Poincare embedding is also employed to handle complex entity structures. We use a real-world dataset from grocery.walmart.com to evaluate the performances on knowledge completion, search ranking and recommendation. The proposed approach compares favourably to baselines in knowledge completion and downstream tasks.
With growing consumer adoption of online grocery shopping through platforms such as Amazon Fresh, Instacart, and Walmart Grocery, there is a pressing business need to provide relevant recommendations throughout the customer journey. In this paper, we introduce a production within-basket grocery recommendation system, RTT2Vec, which generates real-time personalized product recommendations to supplement the user's current grocery basket. We conduct extensive offline evaluation of our system and demonstrate a 9.4% uplift in prediction metrics over baseline state-of-the-art within-basket recommendation models. We also propose an approximate inference technique 11.6x times faster than exact inference approaches. In production, our system has resulted in an increase in average basket size, improved product discovery, and enabled faster user check-out