Hypergraphs offer a framework for depicting the multilateral relationships in real-world complex data. Predicting higher-order relationships, i.e., hyperedges, has become a fundamental problem for fully understanding complicated interactions. The development of graph neural networks (GNNs) has greatly advanced the analysis of ordinary graphs with pairwise relations. However, these methods cannot be easily extended to hypergraphs. In this paper, we summarize, in principle, the challenges that GNNs face in representing higher-order data, namely edge- and node-level ambiguities. To overcome these challenges, we present SNALS, which uses a bipartite graph neural network with structural features to tackle the two ambiguity issues jointly. SNALS captures the joint interactions of a hyperedge through its local environment, which is retrieved by collecting the spectrum information of their connections. As a result, SNALS achieves a nearly 30% performance increase over the most recent GNN-based models. In addition, we applied SNALS to predict higher-order genetic interactions on 3D genome organization data. SNALS showed consistently high prediction accuracy across different chromosomes and generated novel findings on 4-way gene interactions, which are further validated by existing literature.
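The bipartite view of a hypergraph mentioned above can be illustrated with a minimal sketch. This is not the SNALS implementation: the function names `to_bipartite` and `incidence_matrix` and the toy gene identifiers are hypothetical, and the sketch covers only the standard construction in which each node is linked to every hyperedge that contains it.

```python
def to_bipartite(hyperedges):
    """Return sorted (node, hyperedge_index) pairs, one per membership."""
    return sorted(
        (node, idx)
        for idx, members in enumerate(hyperedges)
        for node in members
    )


def incidence_matrix(hyperedges):
    """Return (nodes, matrix) where matrix[i][j] is 1 if node i is in hyperedge j."""
    nodes = sorted({node for members in hyperedges for node in members})
    row_of = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(hyperedges) for _ in nodes]
    for col, members in enumerate(hyperedges):
        for node in members:
            matrix[row_of[node]][col] = 1
    return nodes, matrix


# Hypothetical toy hypergraph: one 3-way and one 2-way interaction.
hyperedges = [{"g1", "g2", "g3"}, {"g2", "g4"}]
bipartite_edges = to_bipartite(hyperedges)
nodes, matrix = incidence_matrix(hyperedges)
```

Each row of `matrix` is a node and each column a hyperedge, so a model operating on the bipartite graph sees node-to-hyperedge memberships rather than pairwise links only.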
Classical recommender system methods typically face the filter bubble problem, in which users receive recommendations only for items already familiar to them, leaving them bored and dissatisfied. To address the filter bubble problem, unexpected recommendations have been proposed to recommend items that deviate significantly from users' prior expectations and thus surprise them with "fresh" and previously unexplored items. In this paper, we describe a novel Personalized Unexpected Recommender System (PURS) model that incorporates unexpectedness into the recommendation process by providing multi-cluster modeling of user interests in the latent space and personalized unexpectedness via the self-attention mechanism and the selection of an appropriate unexpected activation function. Extensive offline experiments on three real-world datasets illustrate that the proposed PURS model significantly outperforms the state-of-the-art baseline approaches in terms of both accuracy and unexpectedness measures. In addition, we conduct an online A/B test at a major video platform, Alibaba-Youku, where our model achieves an increase of over 3% in the average video views per user metric. The proposed model is in the process of being deployed by the company.
Cross-domain recommender systems constitute a powerful method to tackle the cold-start and sparsity problems by aggregating and transferring user preferences across multiple category domains. Therefore, they have great potential to improve click-through rate (CTR) prediction performance on online commerce platforms that have many product domains. While several cross-domain sequential recommendation models have been proposed to leverage information from a source domain to improve CTR predictions in a target domain, they do not take into account bidirectional latent relations of user preferences across source-target domain pairs. As such, they cannot provide enhanced cross-domain CTR predictions for both domains simultaneously. In this paper, we propose a novel approach to cross-domain sequential recommendations based on the dual learning mechanism that simultaneously transfers information between two related domains in an iterative manner until the learning process stabilizes. In particular, the proposed Dual Attentive Sequential Learning (DASL) model consists of two novel components, Dual Embedding and Dual Attention, which jointly establish a two-stage learning process: we first construct dual latent embeddings that extract user preferences in both domains simultaneously, and subsequently provide cross-domain recommendations by matching the extracted latent embeddings with candidate items through a dual-attention learning mechanism. We conduct extensive offline experiments on three real-world datasets to demonstrate the superiority of our proposed model, which significantly and consistently outperforms several state-of-the-art baselines across all experimental settings. We also conduct an online A/B test at a major video streaming platform, Alibaba-Youku, where our proposed model significantly improves business performance over the latest production system in the company.
Cross-domain recommender systems have become increasingly valuable for helping consumers identify useful items in different applications. However, existing cross-domain models typically require a large number of overlapping users, which can be difficult to obtain in some applications. In addition, they do not consider the duality structure of cross-domain recommendation tasks, and thus fail to take into account bidirectional latent relations between users and items and to achieve optimal recommendation performance. To address these issues, in this paper we propose a novel cross-domain recommendation model based on dual learning that transfers information between two related domains in an iterative manner until the learning process stabilizes. We develop a novel latent orthogonal mapping to extract user preferences over multiple domains while preserving relations between users across different latent spaces. Furthermore, we combine the dual learning method with the metric learning approach, which allows us to significantly reduce the required common user overlap across the two domains and leads to even better cross-domain recommendation performance. We test the proposed model on two large-scale industrial datasets and six domain pairs, demonstrating that it consistently and significantly outperforms all the state-of-the-art baselines. We also show that the proposed model works well with very few overlapping users, obtaining satisfactory recommendation performance comparable to that of state-of-the-art baselines that use many overlapping users.
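Why an orthogonal mapping can preserve relations between users across latent spaces can be shown with a small numerical sketch. It assumes the relation of interest is the inner product between user embeddings; the 2D rotation matrix `W`, the angle `theta`, and the vectors `u1` and `u2` are hypothetical illustrations, not the mapping learned by the model described above.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


# A rotation is a simple orthogonal matrix: W^T W = I.
theta = 0.7
W = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

# Hypothetical user embeddings in the source-domain latent space.
u1 = [1.0, 2.0]
u2 = [-0.5, 3.0]

# Map both users into the target-domain latent space.
m1 = matvec(W, u1)
m2 = matvec(W, u2)

# The transpose maps back, so the reverse direction needs no extra parameters.
recovered_u1 = matvec(transpose(W), m1)
```

Because `W` is orthogonal, `dot(m1, m2)` equals `dot(u1, u2)` up to floating-point error, and `recovered_u1` matches `u1`, which is the property that makes such a mapping convenient for transferring in both directions.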
Due to the wide use of highly valuable, large-scale deep neural networks (DNNs), it has become crucial to protect the intellectual property of DNNs so that the ownership of disputed or stolen DNNs can be verified. Most existing solutions embed backdoors during DNN model training so that DNN ownership can be verified by triggering distinguishable model behaviors with a set of secret inputs. However, such solutions are vulnerable to model fine-tuning and pruning. They also suffer from fraudulent ownership claims, as attackers can discover adversarial samples and use them as secret inputs to trigger distinguishable behaviors from stolen models. To address these problems, we propose a novel DNN watermarking solution, named HufuNet, for protecting the ownership of DNN models. We evaluate HufuNet rigorously on four benchmark datasets with five popular DNN models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The experiments demonstrate that HufuNet is highly robust against model fine-tuning/pruning, kernel cutoff/supplement, functionality-equivalent attacks, and fraudulent ownership claims, making it highly promising for protecting large-scale DNN models in the real world.
A plethora of real-world problems require the use of hypergraphs and diffusion algorithms. Examples include recommendation systems, node ranking in food networks, and community detection in social networks, to name a few. Due to the increased size and complexity of real hypergraphs, local and accurate diffusion algorithms that work with the most complex hypergraphs are needed. We propose the first local diffusion method that works on higher-order relations with only a submodularity assumption. Our method is based on a primal-dual optimization formulation where the primal problem has a natural network flow interpretation, and the dual problem has a cut-based interpretation using the $\ell_2$-norm penalty for general submodular cut-costs. We prove that the proposed formulation achieves quadratic approximation error for the problem of local hypergraph clustering. We demonstrate that the new technique is significantly better than state-of-the-art methods over a range of real datasets for the local hypergraph clustering and node ranking problems.
Temporal networks serve as abstractions of many real-world dynamic systems. These networks typically evolve according to certain laws, such as the law of triadic closure, which is universal in social networks. Inductive representation learning of temporal networks should be able to capture such laws and further be applied to systems that follow the same laws but have not been seen during the training stage. Previous works in this area depend on either network node identities or rich edge attributes and typically fail to extract these laws. Here, we propose Causal Anonymous Walks (CAWs) to inductively represent a temporal network. CAWs are extracted by temporal random walks and work as automatic retrieval of temporal network motifs to represent network dynamics while avoiding the time-consuming selection and counting of those motifs. CAWs adopt a novel anonymization strategy that replaces node identities with the hitting counts of the nodes based on a set of sampled walks, which keeps the method inductive and simultaneously establishes the correlation between motifs. We further propose a neural-network model, CAW-N, to encode CAWs, and pair it with a CAW sampling strategy with constant memory and time cost to support online training and inference. CAW-N is evaluated on link prediction over 6 real temporal networks and uniformly outperforms previous SOTA methods by an average AUC gain of 15% in the inductive setting. CAW-N also outperforms previous methods in 5 out of the 6 networks in the transductive setting.
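The anonymization idea, replacing node identities with hitting counts computed from a set of sampled walks, can be sketched as follows. This is one plausible reading rather than the CAW-N implementation: counting hits separately at each walk position is an assumption, the helper names `hitting_counts` and `anonymize` are hypothetical, and timestamps and the walk sampler are omitted.

```python
from collections import defaultdict


def hitting_counts(walks):
    """Map each node to a vector counting how often it appears at each walk position."""
    if not walks:
        return {}
    length = len(walks[0])
    counts = defaultdict(lambda: [0] * length)
    for walk in walks:
        if len(walk) != length:
            raise ValueError("all walks must have the same length")
        for position, node in enumerate(walk):
            counts[node][position] += 1
    return dict(counts)


def anonymize(walk, counts):
    """Replace every node identity in a walk with its count vector."""
    return [tuple(counts[node]) for node in walk]


# Hypothetical walks sampled from a root node "a".
walks = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "a"]]
counts = hitting_counts(walks)
anonymous_walk = anonymize(walks[0], counts)

# Relabeling the nodes leaves the anonymized walk unchanged,
# which is what makes the representation independent of node identities.
relabeled = [["x", "y", "z"], ["x", "z", "y"], ["x", "y", "x"]]
anonymous_relabeled = anonymize(relabeled[0], hitting_counts(relabeled))
```

Since the count vectors depend only on the structure of the sampled walks, two networks that follow the same pattern yield the same anonymized walks even when their node identities differ.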