
Liyu Chen

Collaboration of Teachers for Semi-supervised Object Detection

May 22, 2024

$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model

Mar 11, 2024

$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis

Oct 04, 2023

Layered State Discovery for Incremental Autonomous Exploration

Feb 07, 2023

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path

Oct 10, 2022

Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

May 26, 2022

Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments

May 25, 2022

Policy Learning and Evaluation with Randomized Quasi-Monte Carlo

Feb 21, 2022

Policy Optimization for Stochastic Shortest Path

Feb 07, 2022

Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints

Jan 31, 2022