Item recommendation based on historical user-item interactions is of vital importance for web-based services. However, the data used to train a recommender system (RS) suffers from severe popularity bias. Interactions of a small fraction of popular (head) items account for almost the whole training data. Normal training methods from such biased data tend to repetitively generate recommendations from the head items, which further exacerbates the data bias and affects the exploration of potentially interesting items from niche (tail) items. In this paper, we explore the central theme of long-tail recommendation. Through an empirical study, we find that head items are very likely to be recommended due to the fact that the gradients coming from head items dominate the overall gradient update process, which further affects the optimization of tail items. To this end, we propose a general framework namely Item Cluster-Wise Multi-Objective Training (ICMT) for long-tail recommendation. Firstly, the disentangled representation learning is utilized to identify the popularity impact behind user-item interactions. Then item clusters are adaptively formulated according to the disentangled popularity representation. After that, we consider the learning over the whole training data as a weighted aggregation of multiple item cluster-wise objectives, which can be resolved through a Pareto-Efficient solver for a harmonious overall gradient direction. Besides, a contractive loss focusing on model robustness is derived as a regularization term. We instantiate ICMT with three state-of-the-art recommendation models and conduct experiments on three real-world datasets. %Through alleviating the popularity bias, Experimental results demonstrate that the proposed ICMT significantly improves the overall recommendation performance, especially on tail items.
In this paper, we propose a novel model named DemiNet (short for DEpendency-Aware Multi-Interest Network}) to address the above two issues. To be specific, we first consider various dependency types between item nodes and perform dependency-aware heterogeneous attention for denoising and obtaining accurate sequence item representations. Secondly, for multiple interests extraction, multi-head attention is conducted on top of the graph embedding. To filter out noisy inter-item correlations and enhance the robustness of extracted interests, self-supervised interest learning is introduced to the above two steps. Thirdly, to aggregate the multiple interests, interest experts corresponding to different interest routes give rating scores respectively, while a specialized network assigns the confidence of each score. Experimental results on three real-world datasets demonstrate that the proposed DemiNet significantly improves the overall recommendation performance over several state-of-the-art baselines. Further studies verify the efficacy and interpretability benefits brought from the fine-grained user interest modeling.
E-commerce platforms usually display a mixed list of ads and organic items in feed. One key problem is to allocate the limited slots in the feed to maximize the overall revenue as well as improve user experience, which requires a good model for user preference. Instead of modeling the influence of individual items on user behaviors, the arrangement signal models the influence of the arrangement of items and may lead to a better allocation strategy. However, most of previous strategies fail to model such a signal and therefore result in suboptimal performance. To this end, we propose Cross Deep Q Network (Cross DQN) to extract the arrangement signal by crossing the embeddings of different items and processing the crossed sequence in the feed. Our model results in higher revenue and better user experience than state-of-the-art baselines in offline experiments. Moreover, our model demonstrates a significant improvement in the online A/B test and has been fully deployed on Meituan feed to serve more than 300 millions of customers.
Archetypal analysis is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of archetypal analysis in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data, respectively. We prove that, provided the data is approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with a few vertices, our method can effectively reduce the scaling of archetypal analysis. Moreover, the solution of the reduced problem is near-optimal in terms of prediction errors. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of archetypal analysis. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
Ranking models have achieved promising results, but it remains challenging to design personalized ranking systems to leverage user profiles and semantic representations between queries and documents. In this paper, we propose a topic-based personalized ranking model (TPRM) that integrates user topical profile with pretrained contextualized term representations to tailor the general document ranking list. Experiments on the real-world dataset demonstrate that TPRM outperforms state-of-the-art ad-hoc ranking models and personalized ranking models significantly.
Query expansion with pseudo-relevance feedback (PRF) is a powerful approach to enhance the effectiveness in information retrieval. Recently, with the rapid advance of deep learning techniques, neural text generation has achieved promising success in many natural language tasks. To leverage the strength of text generation for information retrieval, in this article, we propose a novel approach which effectively integrates text generation models into PRF-based query expansion. In particular, our approach generates augmented query terms via neural text generation models conditioned on both the initial query and pseudo-relevance feedback. Moreover, in order to train the generative model, we adopt the conditional generative adversarial nets (CGANs) and propose the PRF-CGAN method in which both the generator and the discriminator are conditioned on the pseudo-relevance feedback. We evaluate the performance of our approach on information retrieval tasks using two benchmark datasets. The experimental results show that our approach achieves comparable performance or outperforms traditional query expansion methods on both the retrieval and reranking tasks.
Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve. We present a new framework to facilitate bounding box annotations for video sequences, which investigates a selection-and-refinement strategy to automatically improve the preliminary annotations generated by tracking algorithms. A temporal assessment network (T-Assess Net) is proposed which is able to capture the temporal coherence of target locations and select reliable tracking results by measuring their quality. Meanwhile, a visual-geometry refinement network (VG-Refine Net) is also designed to further enhance the selected tracking results by considering both target appearance and temporal geometry constraints, allowing inaccurate tracking results to be corrected. The combination of the above two networks provides a principled approach to ensure the quality of automatic video annotation. Experiments on large scale tracking benchmarks demonstrate that our method can deliver highly accurate bounding box annotations and significantly reduce human labor by 94.0%, yielding an effective means to further boost tracking performance with augmented training data.
Dynamic quadrupedal locomotion over rough terrains reveals remarkable progress over the last few decades. Small-scale quadruped robots are adequately flexible and adaptable to traverse uneven terrains along sagittal direction, such as slopes and stairs. To accomplish autonomous locomotion navigation in complex environments, spinning is a fundamental yet indispensable functionality for legged robots. However, spinning behaviors of quadruped robots on uneven terrain often exhibit position drifts. Motivated by this problem, this study presents an algorithmic method to enable accurate spinning motions over uneven terrain and constrain the spinning radius of the Center of Mass (CoM) to be bounded within a small range to minimize the drift risks. A modified spherical foot kinematics representation is proposed to improve the foot kinematic model and rolling dynamics of the quadruped during locomotion. A CoM planner is proposed to generate stable spinning motion based on projected stability margins. Accurate motion tracking is accomplished with Linear Quadratic Regulator (LQR) to bound the position drift during the spinning movement. Experiments are conducted on a small-scale quadruped robot and the effectiveness of the proposed method is verified on versatile terrains including flat ground, stairs and slopes.