Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Recommendation": models, code, and papers

Adversarial Robustness of Probabilistic Network Embedding for Link Prediction

Jul 05, 2021
Xi Chen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

In today's networked society, many real-world problems can be formalized as predicting links in networks, such as Facebook friendship suggestions, e-commerce recommendations, and the prediction of scientific collaborations in citation networks. Increasingly often, link prediction problem is tackled by means of network embedding methods, owing to their state-of-the-art performance. However, these methods lack transparency when compared to simpler baselines, and as a result their robustness against adversarial attacks is a possible point of concern: could one or a few small adversarial modifications to the network have a large impact on the link prediction performance when using a network embedding model? Prior research has already investigated adversarial robustness for network embedding models, focused on classification at the node and graph level. Robustness with respect to the link prediction downstream task, on the other hand, has been explored much less. This paper contributes to filling this gap, by studying adversarial robustness of Conditional Network Embedding (CNE), a state-of-the-art probabilistic network embedding model, for link prediction. More specifically, given CNE and a network, we measure the sensitivity of the link predictions of the model to small adversarial perturbations of the network, namely changes of the link status of a node pair. Thus, our approach allows one to identify the links and non-links in the network that are most vulnerable to such perturbations, for further investigation by an analyst. We analyze the characteristics of the most and least sensitive perturbations, and empirically confirm that our approach not only succeeds in identifying the most vulnerable links and non-links, but also that it does so in a time-efficient manner thanks to an effective approximation.

  Access Paper or Ask Questions

NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application

Feb 09, 2021
Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Qi Liu

Pre-trained language models (PLMs) like BERT have made great progress in NLP. News articles usually contain rich textual information, and PLMs have the potentials to enhance news text modeling for various intelligent news applications like news recommendation and retrieval. However, most existing PLMs are in huge size with hundreds of millions of parameters. Many online news applications need to serve millions of users with low latency tolerance, which poses huge challenges to incorporating PLMs in these scenarios. Knowledge distillation techniques can compress a large PLM into a much smaller one and meanwhile keeps good performance. However, existing language models are pre-trained and distilled on general corpus like Wikipedia, which has some gaps with the news domain and may be suboptimal for news intelligence. In this paper, we propose NewsBERT, which can distill PLMs for efficient and effective news intelligence. In our approach, we design a teacher-student joint learning and distillation framework to collaboratively learn both teacher and student models, where the student model can learn from the learning experience of the teacher model. In addition, we propose a momentum distillation method by incorporating the gradients of teacher model into the update of student model to better transfer useful knowledge learned by the teacher model. Extensive experiments on two real-world datasets with three tasks show that NewsBERT can effectively improve the model performance in various intelligent news applications with much smaller models.

  Access Paper or Ask Questions

Artificial Intelligence & Cooperation

Dec 10, 2020
Elisa Bertino, Finale Doshi-Velez, Maria Gini, Daniel Lopresti, David Parkes

The rise of Artificial Intelligence (AI) will bring with it an ever-increasing willingness to cede decision-making to machines. But rather than just giving machines the power to make decisions that affect us, we need ways to work cooperatively with AI systems. There is a vital need for research in "AI and Cooperation" that seeks to understand the ways in which systems of AIs and systems of AIs with people can engender cooperative behavior. Trust in AI is also key: trust that is intrinsic and trust that can only be earned over time. Here we use the term "AI" in its broadest sense, as employed by the recent 20-Year Community Roadmap for AI Research (Gil and Selman, 2019), including but certainly not limited to, recent advances in deep learning. With success, cooperation between humans and AIs can build society just as human-human cooperation has. Whether coming from an intrinsic willingness to be helpful, or driven through self-interest, human societies have grown strong and the human species has found success through cooperation. We cooperate "in the small" -- as family units, with neighbors, with co-workers, with strangers -- and "in the large" as a global community that seeks cooperative outcomes around questions of commerce, climate change, and disarmament. Cooperation has evolved in nature also, in cells and among animals. While many cases involving cooperation between humans and AIs will be asymmetric, with the human ultimately in control, AI systems are growing so complex that, even today, it is impossible for the human to fully comprehend their reasoning, recommendations, and actions when functioning simply as passive observers.

* A Computing Community Consortium (CCC) white paper, 4 pages 

  Access Paper or Ask Questions

Online Compact Convexified Factorization Machine

Feb 05, 2018
Wenpeng Zhang, Xiao Lin, Peilin Zhao

Factorization Machine (FM) is a supervised learning approach with a powerful capability of feature engineering. It yields state-of-the-art performance in various batch learning tasks where all the training data is made available prior to the training. However, in real-world applications where the data arrives sequentially in a streaming manner, the high cost of re-training with batch learning algorithms has posed formidable challenges in the online learning scenario. The initial challenge is that no prior formulations of FM could fulfill the requirements in Online Convex Optimization (OCO) -- the paramount framework for online learning algorithm design. To address the aforementioned challenge, we invent a new convexification scheme leading to a Compact Convexified FM (CCFM) that seamlessly meets the requirements in OCO. However for learning Compact Convexified FM (CCFM) in the online learning setting, most existing algorithms suffer from expensive projection operations. To address this subsequent challenge, we follow the general projection-free algorithmic framework of Online Conditional Gradient and propose an Online Compact Convex Factorization Machine (OCCFM) algorithm that eschews the projection operation with efficient linear optimization steps. In support of the proposed OCCFM in terms of its theoretical foundation, we prove that the developed algorithm achieves a sub-linear regret bound. To evaluate the empirical performance of OCCFM, we conduct extensive experiments on 6 real-world datasets for online recommendation and binary classification tasks. The experimental results show that OCCFM outperforms the state-of-art online learning algorithms.

* 11 pages, 2 figures, WWW2018 (Accepted) 

  Access Paper or Ask Questions

Learning Continuous User Representations through Hybrid Filtering with doc2vec

Dec 31, 2017
Simon Stiebellehner, Jun Wang, Shuai Yuan

Players in the online ad ecosystem are struggling to acquire the user data required for precise targeting. Audience look-alike modeling has the potential to alleviate this issue, but models' performance strongly depends on quantity and quality of available data. In order to maximize the predictive performance of our look-alike modeling algorithms, we propose two novel hybrid filtering techniques that utilize the recent neural probabilistic language model algorithm doc2vec. We apply these methods to data from a large mobile ad exchange and additional app metadata acquired from the Apple App store and Google Play store. First, we model mobile app users through their app usage histories and app descriptions (user2vec). Second, we introduce context awareness to that model by incorporating additional user and app-related metadata in model training (context2vec). Our findings are threefold: (1) the quality of recommendations provided by user2vec is notably higher than current state-of-the-art techniques. (2) User representations generated through hybrid filtering using doc2vec prove to be highly valuable features in supervised machine learning models for look-alike modeling. This represents the first application of hybrid filtering user models using neural probabilistic language models, specifically doc2vec, in look-alike modeling. (3) Incorporating context metadata in the doc2vec model training process to introduce context awareness has positive effects on performance and is superior to directly including the data as features in the downstream supervised models.

* 10 pages 

  Access Paper or Ask Questions

Anomaly Detection in Wireless Sensor Networks

Aug 27, 2017
Pelumi Oluwasanya

Wireless sensor networks usually comprise a large number of sensors monitoring changes in variables. These changes in variables represent changes in physical quantities. The changes can occur for various reasons; these reasons are highlighted in this work. Outliers are unusual measurements. Outliers are important; they are information-bearing occurrences. This work seeks to identify them based on an approach presented in [1]. A critical review of most previous works in this area has been presented in [2], and few more are considered here just to set the stage. The main work can be described as this; given a set of measurements from sensors that represent a normal situation, [1] proceeds by first estimating the probability density function (pdf) of the set using a data-split approach, then estimate the entropy of the set using the arithmetic mean as an approximation for the expectation. The increase in entropy that occurs when strange data is recorded is used to detect unusual measurements in the test set depending on the desired confidence interval or false alarm rate. The results presented in [1] have been confirmed for different test signals such as the Gaussian, Beta, in one dimension and beta in two dimensions, and a beta and uniform mixture distribution in two dimensions. Finally, the method was confirmed on real data and the results are presented. The major drawbacks of [1] were identified, and a method that seeks to mitigate this using the Bhattacharyya distance is presented. This method detects more subtle anomalies, especially the type that would pass as normal in [1]. Finally, recommendations for future research are presented: the subject of interpretability, especially for subtle measurements, being the most elusive as of today.

  Access Paper or Ask Questions

Retrospective Higher-Order Markov Processes for User Trails

Apr 20, 2017
Tao Wu, David Gleich

Users form information trails as they browse the web, checkin with a geolocation, rate items, or consume media. A common problem is to predict what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been widely used methods to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains, in contrast, have too many parameters and suffer from overfitting the training data. Fitting these parameters with regularization and smoothing only offers mild improvements. In this paper we propose the retrospective higher-order Markov process (RHOMP) as a low-parameter model for such sequences. This model is a special case of a higher-order Markov chain where the transitions depend retrospectively on a single history state instead of an arbitrary combination of history states. There are two immediate computational advantages: the number of parameters is linear in the order of the Markov chain and the model can be fit to large state spaces. Furthermore, by providing a specific structure to the higher-order chain, RHOMPs improve the model accuracy by efficiently utilizing history states without risks of overfitting the data. We demonstrate how to estimate a RHOMP from data and we demonstrate the effectiveness of our method on various real application datasets spanning geolocation data, review sequences, and business locations. The RHOMP model uniformly outperforms higher-order Markov chains, Kneser-Ney regularization, and tensor factorizations in terms of prediction accuracy.

  Access Paper or Ask Questions

Greed is Good: Near-Optimal Submodular Maximization via Greedy Optimization

Apr 05, 2017
Moran Feldman, Christopher Harshaw, Amin Karbasi

It is known that greedy methods perform well for maximizing monotone submodular functions. At the same time, such methods perform poorly in the face of non-monotonicity. In this paper, we show - arguably, surprisingly - that invoking the classical greedy algorithm $O(\sqrt{k})$-times leads to the (currently) fastest deterministic algorithm, called Repeated Greedy, for maximizing a general submodular function subject to $k$-independent system constraints. Repeated Greedy achieves $(1 + O(1/\sqrt{k}))k$ approximation using $O(nr\sqrt{k})$ function evaluations (here, $n$ and $r$ denote the size of the ground set and the maximum size of a feasible solution, respectively). We then show that by a careful sampling procedure, we can run the greedy algorithm only once and obtain the (currently) fastest randomized algorithm, called Sample Greedy, for maximizing a submodular function subject to $k$-extendible system constraints (a subclass of $k$-independent system constrains). Sample Greedy achieves $(k + 3)$-approximation with only $O(nr/k)$ function evaluations. Finally, we derive an almost matching lower bound, and show that no polynomial time algorithm can have an approximation ratio smaller than $ k + 1/2 - \varepsilon$. To further support our theoretical results, we compare the performance of Repeated Greedy and Sample Greedy with prior art in a concrete application (movie recommendation). We consistently observe that while Sample Greedy achieves practically the same utility as the best baseline, it performs at least two orders of magnitude faster.

  Access Paper or Ask Questions

GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through Rate Prediction

Feb 21, 2022
Sihao Hu, Yi Cao, Yu Gong, Zhao Li, Yazheng Yang, Qingwen Liu, Wengwu Ou, Shouling Ji

Short video has witnessed rapid growth in China and shows a promising market for promoting the sales of products in e-commerce platforms like Taobao. To ensure the freshness of the content, the platform needs to release a large number of new videos every day, which makes the conventional click-through rate (CTR) prediction model suffer from the severe item cold-start problem. In this paper, we propose GIFT, an efficient Graph-guIded Feature Transfer system, to fully take advantages of the rich information of warmed-up videos that related to the cold-start video. More specifically, we conduct feature transfer from warmed-up videos to those cold-start ones by involving the physical and semantic linkages into a heterogeneous graph. The former linkages consist of those explicit relationships (e.g., sharing the same category, under the same authorship etc.), while the latter measure the proximity of multimodal representations of two videos. In practice, the style, content, and even the recommendation pattern are pretty similar among those physically or semantically related videos. Besides, in order to provide the robust id representations and historical statistics obtained from warmed-up neighbors that cold-start videos covet most, we elaborately design the transfer function to make aware of different transferred features from different types of nodes and edges along the metapath on the graph. Extensive experiments on a large real-world dataset show that our GIFT system outperforms SOTA methods significantly and brings a 6.82% lift on click-through rate (CTR) in the homepage of Taobao App.

  Access Paper or Ask Questions

Strategic Mitigation of Agent Inattention in Drivers with Open-Quantum Cognition Models

Jul 21, 2021
Qizi Zhang, Venkata Sriram Siddhardh Nadendla, S. N. Balakrishnan, Jerome Busemeyer

State-of-the-art driver-assist systems have failed to effectively mitigate driver inattention and had minimal impacts on the ever-growing number of road mishaps (e.g. life loss, physical injuries due to accidents caused by various factors that lead to driver inattention). This is because traditional human-machine interaction settings are modeled in classical and behavioral game-theoretic domains which are technically appropriate to characterize strategic interaction between either two utility maximizing agents, or human decision makers. Therefore, in an attempt to improve the persuasive effectiveness of driver-assist systems, we develop a novel strategic and personalized driver-assist system which adapts to the driver's mental state and choice behavior. First, we propose a novel equilibrium notion in human-system interaction games, where the system maximizes its expected utility and human decisions can be characterized using any general decision model. Then we use this novel equilibrium notion to investigate the strategic driver-vehicle interaction game where the car presents a persuasive recommendation to steer the driver towards safer driving decisions. We assume that the driver employs an open-quantum system cognition model, which captures complex aspects of human decision making such as violations to classical law of total probability and incompatibility of certain mental representations of information. We present closed-form expressions for players' final responses to each other's strategies so that we can numerically compute both pure and mixed equilibria. Numerical results are presented to illustrate both kinds of equilibria.

* 12 pages, 4 figures, submitted to IEEE Transactions on Human-Machine Systems 

  Access Paper or Ask Questions