Near-Optimal Reinforcement Learning with Self-Play

Jul 14, 2020
Yu Bai, Chi Jin, Tiancheng Yu

A General Framework for Analyzing Stochastic Dynamics in Learning Algorithms

Jun 11, 2020
Chi-Ning Chou, Mien Brabeeba Wang, Tiancheng Yu

Reward-Free Exploration for Reinforcement Learning

Feb 07, 2020
Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

Learning Adversarial MDPs with Bandit Feedback and Unknown Transition

Jan 07, 2020
Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

* Improved the algorithm with a tighter confidence set 

Efficient Policy Learning for Non-Stationary MDPs under Adversarial Manipulation

Aug 21, 2019
Tiancheng Yu, Suvrit Sra

* There is a problem in Theorem 1. We will try to fix it and post an updated version.

Near Optimal Stratified Sampling

Jul 26, 2019
Tiancheng Yu, Xiyu Zhai, Suvrit Sra

* We have discovered a mistake in the main result. The quantity on the RHS of (3) is not equal to the variance of estimator (2) when the sampling rule is designed adaptively, as we do. Additional cross-product terms appear, and these are now the dominant terms. Therefore, although our bound is correct for (3), it no longer implies a bound on the variance of (2).

Entropy Rate Estimation for Markov Chains with Large State Space

Sep 24, 2018
Yanjun Han, Jiantao Jiao, Chuan-Zheng Lee, Tsachy Weissman, Yihong Wu, Tiancheng Yu

* Published as a conference paper at NIPS 2018

