Eugene Vinitsky

Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search

Nov 10, 2025
Via arXiv

Estimating cognitive biases with attention-aware inverse planning

Oct 29, 2025
Via arXiv

Video Game Level Design as a Multi-Agent Reinforcement Learning Problem

Oct 06, 2025
Via arXiv

Building reliable sim driving agents by scaling self-play

Feb 20, 2025
Via arXiv

Reevaluating Policy Gradient Methods for Imperfect-Information Games

Feb 13, 2025
Via arXiv

Robust Autonomy Emerges from Self-Play

Feb 05, 2025
Via arXiv

Few-shot In-Context Preference Learning Using Large Language Models

Oct 22, 2024
Via arXiv

GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Aug 02, 2024
Via arXiv

Human-compatible driving partners through data-regularized self-play reinforcement learning

Mar 28, 2024
Via arXiv

Reinforcement Learning Based Oscillation Dampening: Scaling up Single-Agent RL algorithms to a 100 AV highway field operational test

Feb 26, 2024
Via arXiv