Yaniv Oren

AlphaExploitem: Going Beyond the Nash Equilibrium in Poker by Learning to Exploit Suboptimal Play

May 09, 2026
Via arXiv

Parallelizing Tree Search with Twice Sequential Monte Carlo

Nov 18, 2025
Via arXiv

Bridging the Performance Gap Between Target-Free and Target-Based Reinforcement Learning With Iterated Q-Learning

Jun 04, 2025
Via arXiv

Universal Value-Function Uncertainties

May 27, 2025
Via arXiv

Trust-Region Twisted Policy Improvement

Apr 08, 2025
Via arXiv

Value Improved Actor Critic Algorithms

Jun 03, 2024
Via arXiv