"Recommendation": models, code, and papers

Best Arm Identification with a Fixed Budget under a Small Gap

Feb 10, 2022
Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masatoshi Uehara, Masahiro Nomura, Chao Qin

We consider the fixed-budget best arm identification problem in the multi-armed bandit problem. One of the main interests in this field is to derive a tight lower bound on the probability of misidentifying the best arm and to develop a strategy whose performance guarantee matches the lower bound. However, it has long been an open problem when the optimal allocation ratio of arm draws is unknown. In this paper, we provide an answer to this problem for the case in which the gap between the expected rewards is small. First, we derive a tight problem-dependent lower bound, which characterizes the optimal allocation ratio that depends on the gap of the expected rewards and the Fisher information of the bandit model. Then, we propose the "RS-AIPW" strategy, which consists of the randomized sampling (RS) rule using the estimated optimal allocation ratio and the recommendation rule using the augmented inverse probability weighting (AIPW) estimator. Our proposed strategy is optimal in the sense that the performance guarantee achieves the derived lower bound under a small gap. In the course of the analysis, we present a novel large deviation bound for martingales.
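The AIPW-based recommendation rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's RS-AIPW strategy: the function names, the array layout, and the assumption that the sampling probabilities and the plug-in mean estimates are supplied by the caller are all hypothetical choices made for illustration.

```python
import numpy as np

def aipw_arm_means(arms, rewards, probs, mean_estimates):
    """AIPW estimate of each arm's expected reward (illustrative sketch).

    arms:           (T,) index of the arm pulled in each round
    rewards:        (T,) observed reward in each round
    probs:          (T, K) probability of pulling each arm in each round
    mean_estimates: (T, K) plug-in estimates of the arm means, assumed to be
                    built from data observed before each round
    """
    arms = np.asarray(arms)
    rewards = np.asarray(rewards, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mean_estimates = np.asarray(mean_estimates, dtype=float)
    T, K = probs.shape
    pulled = np.zeros((T, K))
    pulled[np.arange(T), arms] = 1.0
    # Inverse-probability-weighted residual plus the plug-in estimate.
    correction = pulled * (rewards[:, None] - mean_estimates) / probs
    return (correction + mean_estimates).mean(axis=0)

def recommend_arm(arms, rewards, probs, mean_estimates):
    """Recommend the arm with the largest AIPW estimate."""
    return int(np.argmax(aipw_arm_means(arms, rewards, probs, mean_estimates)))

# Toy data: two arms pulled with equal probability, plug-in estimates set to zero,
# in which case the AIPW estimate reduces to plain inverse probability weighting.
toy_arms = [0, 1, 0, 1]
toy_rewards = [1.0, 2.0, 3.0, 4.0]
toy_probs = np.full((4, 2), 0.5)
toy_means = np.zeros((4, 2))
print(aipw_arm_means(toy_arms, toy_rewards, toy_probs, toy_means))
print(recommend_arm(toy_arms, toy_rewards, toy_probs, toy_means))
```

The augmentation term is what distinguishes AIPW from plain inverse probability weighting: when the plug-in estimates are accurate, the weighted residual is small, which tends to reduce variance.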


Bayesian Promised Persuasion: Dynamic Forward-Looking Multiagent Delegation with Informational Burning

Jan 16, 2022
Tao Zhang, Quanyan Zhu

This work studies a dynamic mechanism design problem in which a principal delegates decision-making to a group of privately-informed agents without monetary transfers or burning. We consider that the principal privately possesses complete knowledge about the state transitions and study how she can use her private observation to support the incentive compatibility of the delegation via informational burning, a process we refer to as the looking-forward persuasion. The delegation mechanism is formulated in which the agents form belief hierarchies due to the persuasion and play a dynamic Bayesian game. We propose a novel randomized mechanism, known as Bayesian promised delegation (BPD), in which the periodic incentive compatibility is guaranteed by persuasions and promises of future delegations. We show that the BPD can achieve the same optimal social welfare as the original mechanism in stationary Markov perfect Bayesian equilibria. A revelation-principle-like design regime is established to show that the persuasion with belief hierarchies can be fully characterized by correlating the randomization of the agents' local BPD mechanisms with the persuasion as a direct recommendation of the future promises.


KGE-CL: Contrastive Learning of Knowledge Graph Embeddings

Dec 09, 2021
Wentao Xu, Zhiping Luo, Weiqing Liu, Jiang Bian, Jian Yin, Tie-Yan Liu

Learning the embeddings of knowledge graphs is vital in artificial intelligence, and can benefit various downstream applications, such as recommendation and question answering. In recent years, many research efforts have been proposed for knowledge graph embedding. However, most previous knowledge graph embedding methods ignore the semantic similarity between the related entities and entity-relation couples in different triples since they separately optimize each triple with the scoring function. To address this problem, we propose a simple yet efficient contrastive learning framework for knowledge graph embeddings, which can shorten the semantic distance of the related entities and entity-relation couples in different triples and thus improve the expressiveness of knowledge graph embeddings. We evaluate our proposed method on three standard knowledge graph benchmarks. It is noteworthy that our method can yield some new state-of-the-art results, achieving 51.2% MRR, 46.8% Hits@1 on the WN18RR dataset, and 59.1% MRR, 51.8% Hits@1 on the YAGO3-10 dataset.
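For readers unfamiliar with the link-prediction metrics quoted in this abstract, the following sketch shows how mean reciprocal rank (MRR) and Hits@k are conventionally computed from the rank of the correct entity in each test query. The function names and the toy ranks are illustrative assumptions, not taken from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct entity for four test triples.
toy_ranks = [1, 2, 4, 1]
print(mean_reciprocal_rank(toy_ranks))  # 0.6875
print(hits_at_k(toy_ranks, 1))          # 0.5
```

Both metrics are usually reported as percentages, so the toy values would be written as 68.75% MRR and 50% Hits@1.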


Unbiased Graph Embedding with Biased Graph Observations

Oct 29, 2021
Nan Wang, Lu Lin, Jundong Li, Hongning Wang

Graph embedding techniques have been increasingly employed in real-world machine learning tasks on graph-structured data, such as social recommendations and protein structure modeling. Since the generation of a graph is inevitably affected by some sensitive node attributes (such as gender and age of users in a social network), the learned graph representations can inherit such sensitive information and introduce undesirable biases in downstream tasks. Most existing works on debiasing graph representations add ad-hoc constraints on the learned embeddings to restrict their distributions, which, however, compromises the utility of the resulting graph representations in downstream tasks. In this paper, we propose a principled new way for obtaining unbiased representations by learning from an underlying bias-free graph that is not influenced by sensitive attributes. Based on this new perspective, we propose two complementary methods for uncovering such an underlying graph with the goal of introducing minimum impact on the utility of learned representations in downstream tasks. Both our theoretical justification and extensive experimental comparisons against state-of-the-art solutions demonstrate the effectiveness of our proposed methods.


Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction

Oct 03, 2021
Rishubh Parihar, Gaurav Ramola, Ranajit Saha, Ravi Kini, Aniket Rege, Sudha Velusamy

Ever-increasing smartphone-generated video content demands intelligent techniques to edit and enhance videos on power-constrained devices. Most of the best performing algorithms for video understanding tasks like action recognition, localization, etc., rely heavily on rich spatio-temporal representations to make accurate predictions. For effective learning of the spatio-temporal representation, it is crucial to understand the underlying object motion patterns present in the video. In this paper, we propose a novel approach for understanding object motions via motion type classification. The proposed motion type classifier predicts a motion type for the video based on the trajectories of the objects present. Our classifier assigns a motion type for the given video from the following five primitive motion classes: linear, projectile, oscillatory, local and random. We demonstrate that the representations learned from the motion type classification generalize well for the challenging downstream task of video retrieval. Further, we propose a recommendation system for video playback style based on the motion type classifier predictions.

* 10 pages, 5 figures, 4 tables, ICCV Workshops 2021 - SRVU 


Machine Learning-Powered Mitigation Policy Optimization in Epidemiological Models

Oct 16, 2020
Jayaraman J. Thiagarajan, Peer-Timo Bremer, Rushil Anirudh, Timothy C. Germann, Sara Y. Del Valle, Frederick H. Streitz

A crucial aspect of managing a public health crisis is to effectively balance prevention and mitigation strategies, while taking their socio-economic impact into account. In particular, determining the influence of different non-pharmaceutical interventions (NPIs) on the effective use of public resources is an important problem, given the uncertainties on when a vaccine will be made available. In this paper, we propose a new approach for obtaining optimal policy recommendations based on epidemiological models, which can characterize the disease progression under different interventions, and a look-ahead reward optimization strategy to choose the suitable NPI at different stages of an epidemic. Given the time delay inherent in any epidemiological model and the exponential nature especially of an unmanaged epidemic, we find that such a look-ahead strategy infers non-trivial policies that adhere well to the constraints specified. Using two different epidemiological models, namely SEIR and EpiCast, we evaluate the proposed algorithm to determine the optimal NPI policy, under a constraint on the number of daily new cases and the primary reward being the absence of restrictions.
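The idea of a look-ahead choice of NPI under a daily-case constraint can be sketched with a toy discrete-time SEIR model. Everything below is an assumption made for illustration: the rate parameters, the modelling of an NPI as a reduced transmission rate, the definition of daily new cases as exposed-to-infectious transitions, and the rule of picking the least restrictive feasible option. It is not the paper's algorithm or its EpiCast setup.

```python
def seir_step(state, beta, sigma=0.2, gamma=0.1):
    """Advance a toy SEIR model by one day.

    state: (S, E, I, R) compartment sizes; beta, sigma, gamma are assumed
    transmission, incubation, and recovery rates.
    Returns the new state and the day's new infectious cases.
    """
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n
    new_infectious = sigma * e
    new_recovered = gamma * i
    new_state = (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )
    return new_state, new_infectious

def choose_npi(state, betas, cap, horizon=14):
    """Pick the least restrictive NPI that keeps simulated daily cases under cap.

    betas: transmission rates ordered from least to most restrictive NPI.
    Falls back to the most restrictive option if none satisfies the cap.
    """
    for idx, beta in enumerate(betas):
        sim, feasible = state, True
        for _ in range(horizon):
            sim, new_cases = seir_step(sim, beta)
            if new_cases > cap:
                feasible = False
                break
        if feasible:
            return idx
    return len(betas) - 1

# Hypothetical population and three NPI levels (no restriction, moderate, strict).
start = (900.0, 50.0, 50.0, 0.0)
print(choose_npi(start, betas=[0.5, 0.3, 0.1], cap=15.0))
```

Simulating forward before choosing matters because of the delay between exposure and case counts: an NPI that looks acceptable today may violate the cap several days later.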


Statistical Inference for Online Decision-Making: In a Contextual Bandit Setting

Oct 14, 2020
Haoyu Chen, Wenbin Lu, Rui Song

The online decision-making problem requires us to make a sequence of decisions based on incremental information. Common solutions often need to learn a reward model of different actions given the contextual information and then maximize the long-term reward. It is meaningful to know if the posited model is reasonable and how the model performs in the asymptotic sense. We study this problem under the setup of the contextual bandit framework with a linear reward model. The $\varepsilon$-greedy policy is adopted to address the classic exploration-and-exploitation dilemma. Using the martingale central limit theorem, we show that the online ordinary least squares estimator of model parameters is asymptotically normal. When the linear model is misspecified, we propose the online weighted least squares estimator using the inverse propensity score weighting and also establish its asymptotic normality. Based on the properties of the parameter estimators, we further show that the in-sample inverse propensity weighted value estimator is asymptotically normal. We illustrate our results using simulations and an application to a news article recommendation dataset from Yahoo!.

* Accepted by the Journal of the American Statistical Association 
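The ingredients named in this abstract ($\varepsilon$-greedy action probabilities, a least squares fit of a linear reward model, and an inverse propensity weighted value estimate) can be sketched as follows. The function names, the tiny ridge term added for numerical stability, and the toy data are assumptions for illustration and do not reproduce the paper's estimators or their asymptotic analysis.

```python
import numpy as np

def epsilon_greedy_probs(context, thetas, epsilon):
    """Action probabilities under epsilon-greedy with a linear reward model.

    thetas: (K, d) one parameter vector per action; context: (d,).
    """
    thetas = np.asarray(thetas, dtype=float)
    context = np.asarray(context, dtype=float)
    K = thetas.shape[0]
    greedy = int(np.argmax(thetas @ context))
    probs = np.full(K, epsilon / K)
    probs[greedy] += 1.0 - epsilon
    return probs

def ols_fit(X, y, ridge=1e-8):
    """Least squares estimate; the small ridge term is an assumed safeguard."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def ipw_value(rewards, actions, target_actions, propensities):
    """IPW estimate of the value of a target policy from logged data.

    propensities: probability with which each logged action was chosen.
    """
    rewards = np.asarray(rewards, dtype=float)
    match = np.asarray(actions) == np.asarray(target_actions)
    return float(np.mean(match * rewards / np.asarray(propensities, dtype=float)))

# Toy usage with two actions and a two-dimensional context.
print(epsilon_greedy_probs([1.0, 0.0], [[1.0, 0.0], [2.0, 0.0]], epsilon=0.2))
print(ols_fit([[1, 0], [0, 1], [1, 1]], [2.0, 3.0, 5.0]))
print(ipw_value([1.0, 2.0], [0, 1], [0, 0], [0.5, 0.5]))
```

Because $\varepsilon$-greedy gives every action a probability of at least $\varepsilon / K$, the propensities in the IPW estimate are bounded away from zero, which keeps the weights finite.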


Connecting Web Event Forecasting with Anomaly Detection: A Case Study on Enterprise Web Applications Using Self-Supervised Neural Networks

Sep 07, 2020
Xiaoyong Yuan, Lei Ding, Malek Ben Salem, Xiaolin Li, Dapeng Wu

Recently, web applications have been widely used in enterprises to assist employees in providing effective and efficient business processes. Forecasting upcoming web events in enterprise web applications can be beneficial in many ways, such as efficient caching and recommendation. In this paper, we present a web event forecasting approach, DeepEvent, in enterprise web applications for better anomaly detection. DeepEvent includes three key features: web-specific neural networks to take into account the characteristics of sequential web events, self-supervised learning techniques to overcome the scarcity of labeled data, and sequence embedding techniques to integrate contextual events and capture dependencies among web events. We evaluate DeepEvent on web events collected from six real-world enterprise web applications. Our experimental results demonstrate that DeepEvent is effective in forecasting sequential web events and detecting web-based anomalies. DeepEvent provides a context-based system for researchers and practitioners to better forecast web events with situational awareness.

* accepted at EAI SecureComm 2020 


Towards Ecologically Valid Research on Language User Interfaces

Jul 28, 2020
Harm de Vries, Dzmitry Bahdanau, Christopher Manning

Language User Interfaces (LUIs) could improve human-machine interaction for a wide variety of tasks, such as playing music, getting insights from databases, or instructing domestic robots. In contrast to traditional hand-crafted approaches, recent work attempts to build LUIs in a data-driven way using modern deep learning methods. To satisfy the data needs of such learning algorithms, researchers have constructed benchmarks that emphasize the quantity of collected data at the cost of its naturalness and relevance to real-world LUI use cases. As a consequence, research findings on such benchmarks might not be relevant for developing practical LUIs. The goal of this paper is to bootstrap the discussion around this issue, which we refer to as the benchmarks' low ecological validity. To this end, we describe what we deem an ideal methodology for machine learning research on LUIs and categorize five common ways in which recent benchmarks deviate from it. We give concrete examples of the five kinds of deviations and their consequences. Lastly, we offer a number of recommendations as to how to increase the ecological validity of machine learning research on LUIs.
