
Xun Yu Zhou

Reward-Directed Score-Based Diffusion Models via q-Learning

Sep 07, 2024
Via arXiv

Sublinear Regret for An Actor-Critic Algorithm in Continuous-Time Linear-Quadratic Reinforcement Learning

Jul 24, 2024
Via arXiv

Reinforcement Learning for Jump-Diffusions

May 26, 2024
Via arXiv

Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration

Dec 19, 2023
Via arXiv

Variable Clustering via Distributionally Robust Nodewise Regression

Dec 21, 2022
Via arXiv

Square-root regret bounds for continuous-time episodic Markov decision processes

Oct 03, 2022
Via arXiv

Choquet regularization for reinforcement learning

Aug 17, 2022
Via arXiv

q-Learning in Continuous Time

Jul 02, 2022
Via arXiv

Logarithmic regret bounds for continuous-time average-reward Markov decision processes

May 24, 2022
Via arXiv

Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms

Nov 22, 2021
Via arXiv