
Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

Oct 08, 2020
Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar



Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems. We develop novel index policies that we prove achieve order-optimality, and show in extensive experiments that their empirical performance is competitive with state-of-the-art benchmark methods. The new policies achieve this with low computation time per pull for linear bandits, thereby offering both favorable regret and computational efficiency.


