Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Recommendation": models, code, and papers

Feedback-Based Tree Search for Reinforcement Learning

May 15, 2018
Daniel R. Jiang, Emmanuel Ekwedike, Han Liu

Inspired by recent successes of Monte-Carlo tree search (MCTS) in a number of artificial intelligence (AI) application domains, we propose a model-based reinforcement learning (RL) technique that iteratively applies MCTS on batches of small, finite-horizon versions of the original infinite-horizon Markov decision process. The terminal condition of the finite-horizon problems, or the leaf-node evaluator of the decision tree generated by MCTS, is specified using a combination of an estimated value function and an estimated policy function. The recommendations generated by the MCTS procedure are then provided as feedback in order to refine, through classification and regression, the leaf-node evaluator for the next iteration. We provide the first sample complexity bounds for a tree search-based RL algorithm. In addition, we show that a deep neural network implementation of the technique can create a competitive AI agent for the popular multi-player online battle arena (MOBA) game King of Glory.

* 19 pages, to be presented at ICML 2018 

  Access Paper or Ask Questions

Hybrid Metaheuristics for the Clustered Vehicle Routing Problem

Apr 26, 2014
Thibaut Vidal, Maria Battarra, Anand Subramanian, Güneş Erdoǧan

The Clustered Vehicle Routing Problem (CluVRP) is a variant of the Capacitated Vehicle Routing Problem in which customers are grouped into clusters. Each cluster has to be visited once, and a vehicle entering a cluster cannot leave it until all customers have been visited. This article presents two alternative hybrid metaheuristic algorithms for the CluVRP. The first algorithm is based on an Iterated Local Search algorithm, in which only feasible solutions are explored and problem-specific local search moves are utilized. The second algorithm is a Hybrid Genetic Search, for which the shortest Hamiltonian path between each pair of vertices within each cluster should be precomputed. Using this information, a sequence of clusters can be used as a solution representation and large neighborhoods can be efficiently explored by means of bi-directional dynamic programming, sequence concatenations, by using appropriate data structures. Extensive computational experiments are performed on benchmark instances from the literature, as well as new large scale ones. Recommendations on promising algorithm choices are provided relatively to average cluster size.

* Working Paper, MIT -- 22 pages 

  Access Paper or Ask Questions

Acoustical Quality Assessment of the Classroom Environment

Jan 13, 2012
Marian George, Moustafa Youssef

Teaching is one of the most important factors affecting any education system. Many research efforts have been conducted to facilitate the presentation modes used by instructors in classrooms as well as provide means for students to review lectures through web browsers. Other studies have been made to provide acoustical design recommendations for classrooms like room size and reverberation times. However, using acoustical features of classrooms as a way to provide education systems with feedback about the learning process was not thoroughly investigated in any of these studies. We propose a system that extracts different sound features of students and instructors, and then uses machine learning techniques to evaluate the acoustical quality of any learning environment. We infer conclusions about the students' satisfaction with the quality of lectures. Using classifiers instead of surveys and other subjective ways of measures can facilitate and speed such experiments which enables us to perform them continuously. We believe our system enables education systems to continuously review and improve their teaching strategies and acoustical quality of classrooms.

* 7 pages, technical report 

  Access Paper or Ask Questions

Should we tweet this? Generative response modeling for predicting reception of public health messaging on Twitter

Apr 09, 2022
Abraham Sanders, Debjani Ray-Majumder, John S. Erickson, Kristin P. Bennett

The way people respond to messaging from public health organizations on social media can provide insight into public perceptions on critical health issues, especially during a global crisis such as COVID-19. It could be valuable for high-impact organizations such as the US Centers for Disease Control and Prevention (CDC) or the World Health Organization (WHO) to understand how these perceptions impact reception of messaging on health policy recommendations. We collect two datasets of public health messages and their responses from Twitter relating to COVID-19 and Vaccines, and introduce a predictive method which can be used to explore the potential reception of such messages. Specifically, we harness a generative model (GPT-2) to directly predict probable future responses and demonstrate how it can be used to optimize expected reception of important health guidance. Finally, we introduce a novel evaluation scheme with extensive statistical testing which allows us to conclude that our models capture the semantics and sentiment found in actual public health responses.

* Accepted at ACM WebSci 2022 

  Access Paper or Ask Questions

Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network

Feb 17, 2022
Yun Da Tsai, Shou De Lin

This work addresses the efficiency concern on inferring a nonlinear contextual bandit when the number of arms $n$ is very large. We propose a neural bandit model with an end-to-end training process to efficiently perform bandit algorithms such as Thompson Sampling and UCB during inference. We advance state-of-the-art time complexity to $O(\log n)$ with approximate Bayesian inference, neural random feature mapping, approximate global maxima and approximate nearest neighbor search. We further propose a generative adversarial network to shift the bottleneck of maximizing the objective for selecting optimal arms from inference time to training time, enjoying significant speedup with additional advantage of enabling batch and parallel processing. %The generative model can inference an approximate argmax of the posterior sampling in logarithmic time complexity with the help of approximate nearest neighbor search. Extensive experiments on classification and recommendation tasks demonstrate order-of-magnitude improvement in inference time no significant degradation on the performance.

  Access Paper or Ask Questions

Multiway Spherical Clustering via Degree-Corrected Tensor Block Models

Jan 19, 2022
Jiaxin Hu, Miaoyan Wang

We consider the problem of multiway clustering in the presence of unknown degree heterogeneity. Such data problems arise commonly in applications such as recommendation system, neuroimaging, community detection, and hypergraph partitions in social networks. The allowance of degree heterogeneity provides great flexibility in clustering models, but the extra complexity poses significant challenges in both statistics and computation. Here, we develop a degree-corrected tensor block model with estimation accuracy guarantees. We present the phase transition of clustering performance based on the notion of angle separability, and we characterize three signal-to-noise regimes corresponding to different statistical-computational behaviors. In particular, we demonstrate that an intrinsic statistical-to-computational gap emerges only for tensors of order three or greater. Further, we develop an efficient polynomial-time algorithm that provably achieves exact clustering under mild signal conditions. The efficacy of our procedure is demonstrated through two data applications, one on human brain connectome project, and another on Peru Legislation network dataset.

  Access Paper or Ask Questions

A Survey on AI Assurance

Nov 15, 2021
Feras A. Batarseh, Laura Freeman

Artificial Intelligence (AI) algorithms are increasingly providing decision making and operational support across multiple domains. AI includes a wide library of algorithms for different problems. One important notion for the adoption of AI algorithms into operational decision process is the concept of assurance. The literature on assurance, unfortunately, conceals its outcomes within a tangled landscape of conflicting approaches, driven by contradicting motivations, assumptions, and intuitions. Accordingly, albeit a rising and novel area, this manuscript provides a systematic review of research works that are relevant to AI assurance, between years 1985 - 2021, and aims to provide a structured alternative to the landscape. A new AI assurance definition is adopted and presented and assurance methods are contrasted and tabulated. Additionally, a ten-metric scoring system is developed and introduced to evaluate and compare existing methods. Lastly, in this manuscript, we provide foundational insights, discussions, future directions, a roadmap, and applicable recommendations for the development and deployment of AI assurance.

* J Big Data 8, 60 (2021) 
* This paper is published at Springer's Journal of Big Data 

  Access Paper or Ask Questions

Noisy Channel Language Model Prompting for Few-Shot Text Classification

Aug 15, 2021
Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer

We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred as direct models), channel models compute the conditional probability of the input given the label, and are thereby required to explain every word in the input. We use channel models for recently proposed few-shot learning methods with no or very limited updates to the language model parameters, via either in-context demonstration or prompt tuning. Our experiments show that, for both methods, channel models significantly outperform their direct counterparts, which we attribute to their stability, i.e., lower variance and higher worst-case accuracy. We also present extensive ablations that provide recommendations for when to use channel prompt tuning instead of other competitive models (e.g., direct head tuning): channel prompt tuning is preferred when the number of training examples is small, labels in the training data are imbalanced, or generalization to unseen labels is required.

* 15 pages, 6 figures 

  Access Paper or Ask Questions

The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted $L_1$ regularized Neural Networks Predictions

Aug 02, 2021
Mehdi Rostami, Olli Saarela, Michael Escobar

The Doubly Robust (DR) estimation of ATE can be carried out in 2 steps, where in the first step, the treatment and outcome are modeled, and in the second step the predictions are inserted into the DR estimator. The model misspecification in the first step has led researchers to utilize Machine Learning algorithms instead of parametric algorithms. However, existence of strong confounders and/or Instrumental Variables (IVs) can lead the complex ML algorithms to provide perfect predictions for the treatment model which can violate the positivity assumption and elevate the variance of DR estimators. Thus the ML algorithms must be controlled to avoid perfect predictions for the treatment model while still learn the relationship between the confounders and the treatment and outcome. We use two Neural network architectures and investigate how their hyperparameters should be tuned in the presence of confounders and IVs to achieve a low bias-variance tradeoff for ATE estimators such as DR estimator. Through simulation results, we will provide recommendations as to how NNs can be employed for ATE estimation.

  Access Paper or Ask Questions