Yihan Du

Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization

Feb 15, 2024
Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant

Cascading Reinforcement Learning

Jan 17, 2024
Yihan Du, R. Srikant, Wei Chen

Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation

Jul 06, 2023
Yu Chen, Yihan Du, Pihe Hu, Siwei Wang, Desheng Wu, Longbo Huang

Multi-task Representation Learning for Pure Exploration in Linear Bandits

Feb 09, 2023
Yihan Du, Longbo Huang, Wen Sun

Dueling Bandits: From Two-dueling to Multi-dueling

Nov 16, 2022
Yihan Du, Siwei Wang, Longbo Huang

Risk-Sensitive Reinforcement Learning: Iterated CVaR and the Worst Path

Jun 06, 2022
Yihan Du, Siwei Wang, Longbo Huang

Branching Reinforcement Learning

Feb 16, 2022
Yihan Du, Wei Chen

Collaborative Pure Exploration in Kernel Bandit

Oct 29, 2021
Yihan Du, Wei Chen, Yuko Kuroki, Longbo Huang

Combinatorial Pure Exploration with Bottleneck Reward Function and its Extension to General Reward Functions

Feb 24, 2021
Yihan Du, Yuko Kuroki, Wei Chen

Continuous Mean-Covariance Bandits

Feb 24, 2021
Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang
