Click-Through Rate (CTR) prediction, crucial in applications like recommender systems and online advertising, involves ranking items based on the likelihood of user clicks. User behavior sequence modeling has marked progress in CTR prediction, which extracts users' latent interests from their historical behavior sequences to facilitate accurate CTR prediction. Recent research explores using implicit feedback sequences, like unclicked records, to extract diverse user interests. However, these methods encounter key challenges: 1) temporal misalignment due to disparate sequence time ranges and 2) the lack of fine-grained interaction among feedback sequences. To address these challenges, we propose a novel framework called TEM4CTR, which ensures temporal alignment among sequences while leveraging auxiliary feedback information to enhance click behavior at the item level through a representation projection mechanism. Moreover, this projection-based information transfer module can effectively alleviate the negative impact of irrelevant or even potentially detrimental components of the auxiliary feedback information on the learning process of click behavior. Comprehensive experiments on public and industrial datasets confirm the superiority and effectiveness of TEM4CTR, showcasing the significance of temporal alignment in multi-feedback modeling.
Multi-behavior recommendation algorithms aim to leverage the multiplex interactions between users and items to learn users' latent preferences. Recent multi-behavior recommendation frameworks contain two steps: fusion and prediction. In the fusion step, advanced neural networks are used to model the hierarchical correlations between user behaviors. In the prediction step, multiple signals are utilized to jointly optimize the model with a multi-task learning (MTL) paradigm. However, recent approaches have not addressed the issue caused by imbalanced data distribution in the fusion step, resulting in the learned relationships being dominated by high-frequency behaviors. In the prediction step, the existing methods use a gate mechanism to directly aggregate expert information generated by coupling input, leading to negative information transfer. To tackle these issues, we propose a Parallel Knowledge Enhancement Framework (PKEF) for multi-behavior recommendation. Specifically, we enhance the hierarchical information propagation in the fusion step using parallel knowledge (PKF). Meanwhile, in the prediction step, we decouple the representations to generate expert information and introduce a projection mechanism during aggregation to eliminate gradient conflicts and alleviate negative transfer (PME). We conduct comprehensive experiments on three real-world datasets to validate the effectiveness of our model. The results further demonstrate the rationality and effectiveness of the designed PKF and PME modules. The source code and datasets are available at https://github.com/MC-CV/PKEF.
Multi-types of user behavior data (e.g., clicking, adding to cart, and purchasing) are recorded in most real-world recommendation scenarios, which can help to learn users' multi-faceted preferences. However, it is challenging to explore multi-behavior data due to the unbalanced data distribution and sparse target behavior, which lead to the inadequate modeling of high-order relations when treating multi-behavior data ''as features'' and gradient conflict in multitask learning when treating multi-behavior data ''as labels''. In this paper, we propose CIGF, a Compressed Interaction Graph based Framework, to overcome the above limitations. Specifically, we design a novel Compressed Interaction Graph Convolution Network (CIGCN) to model instance-level high-order relations explicitly. To alleviate the potential gradient conflict when treating multi-behavior data ''as labels'', we propose a Multi-Expert with Separate Input (MESI) network with separate input on the top of CIGCN for multi-task learning. Comprehensive experiments on three large-scale real-world datasets demonstrate the superiority of CIGF. Ablation studies and in-depth analysis further validate the effectiveness of our proposed model in capturing high-order relations and alleviating gradient conflict. The source code and datasets are available at https://github.com/MC-CV/CIGF.
* Wei Guo and Chang Meng are co-first authors and contributed equally
to this research. Chang Meng is supervised by Wei Guo when he was a research
intern at Huawei Noah's Ark Lab
Multi-types of behaviors (e.g., clicking, adding to cart, purchasing, etc.) widely exist in most real-world recommendation scenarios, which are beneficial to learn users' multi-faceted preferences. As dependencies are explicitly exhibited by the multiple types of behaviors, effectively modeling complex behavior dependencies is crucial for multi-behavior prediction. The state-of-the-art multi-behavior models learn behavior dependencies indistinguishably with all historical interactions as input. However, different behaviors may reflect different aspects of user preference, which means that some irrelevant interactions may play as noises to the target behavior to be predicted. To address the aforementioned limitations, we introduce multi-interest learning to the multi-behavior recommendation. More specifically, we propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning (CKML) framework to learn shared and behavior-specific interests for different behaviors. CKML introduces two advanced modules, namely Coarse-grained Interest Extracting (CIE) and Fine-grained Behavioral Correlation (FBC), which work jointly to capture fine-grained behavioral dependencies. CIE uses knowledge-aware information to extract initial representations of each interest. FBC incorporates a dynamic routing scheme to further assign each behavior among interests. Additionally, we use the self-attention mechanism to correlate different behavioral information at the interest level. Empirical results on three real-world datasets verify the effectiveness and efficiency of our model in exploiting multi-behavior data. Further experiments demonstrate the effectiveness of each module and the robustness and superiority of the shared and specific modelling paradigm for multi-behavior data.
* The self-attention mechanism should be contained in FBC and SRR is a
part of CIE, but we does not do this in our previous code implementation,
thus we redid the experiment and updated the experimental results, and tested
the significance of the ablation experiment. Besides, we corrected some of
the experimental details
This report summarizes the results of Learning to Understand Aerial Images (LUAI) 2021 challenge held on ICCV 2021, which focuses on object detection and semantic segmentation in aerial images. Using DOTA-v2.0 and GID-15 datasets, this challenge proposes three tasks for oriented object detection, horizontal object detection, and semantic segmentation of common categories in aerial images. This challenge received a total of 146 registrations on the three tasks. Through the challenge, we hope to draw attention from a wide range of communities and call for more efforts on the problems of learning to understand aerial images.