Abstract: Large Language Models (LLMs) have demonstrated strong performance on tasks with short time horizons, but struggle with tasks that unfold over longer durations. While datasets covering extended-duration tasks, such as software engineering tasks or video games, do exist, there are currently few implementations of complex board games designed specifically for reinforcement learning and LLM evaluation. To address this gap, we propose the 4Hammer reinforcement learning environment, a digital twin simulation of a subset of Warhammer 40,000, a complex, zero-sum board game. Warhammer 40,000 features intricate rules: human players must thoroughly read and understand over 50 pages of detailed natural-language rules, grasp the interactions between their own game pieces and those of their opponents, and independently track and communicate the evolving game state.
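To make the intended usage concrete, the following is a minimal sketch of how an agent might interact with such an environment through a standard step-based loop. The class name FourHammerEnv and its textual state and action encodings are illustrative assumptions, not the environment's actual API.

```python
# Illustrative sketch only: a generic step-based interaction loop for an
# environment like 4Hammer. The class `FourHammerEnv` and its textual
# observation/action encoding are hypothetical, not the paper's actual API.
import random


class FourHammerEnv:
    """Hypothetical digital-twin environment for a Warhammer 40,000 subset."""

    def reset(self) -> str:
        # Return the initial game state, serialized as natural-language text
        # so that an LLM agent can consume it directly.
        return "Deployment phase. Player 1 to act. Units: ..."

    def legal_actions(self) -> list[str]:
        # Enumerate the moves permitted by the rules in the current state.
        return ["move unit A to (3, 4)", "shoot unit B at target C"]

    def step(self, action: str) -> tuple[str, float, bool]:
        # Apply the chosen action; return (next state, reward, episode done).
        return "Movement phase. Player 2 to act. Units: ...", 0.0, False


env = FourHammerEnv()
state = env.reset()
for _ in range(10):  # a few illustrative turns
    action = random.choice(env.legal_actions())  # stand-in for an LLM policy
    state, reward, done = env.step(action)
    if done:
        break
```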
Abstract: Reinforcement learning (RL) algorithms, because they rely on external systems to learn from, require digital environments (e.g., simulators) with very simple interfaces, which in turn significantly constrain how such environments can be implemented. In particular, these environments are implemented either as separate processes or as state machines, leading to synchronization and communication overheads in the first case and to unstructured programming in the second. We propose a new domain-specific, coroutine-based, compiled language, called Rulebook, designed to automatically generate the state machine required to interact with machine learning (ML) algorithms and similar applications, with no performance overhead. Rulebook lets users express programs without needing to be aware of the specific interface required by the ML components. By decoupling the execution model of a program from its syntactic encoding, and thus removing the need for manual state management, Rulebook makes it possible to create larger and more sophisticated environments at a lower development cost.
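As an illustration of the underlying idea (written in Python rather than Rulebook, whose syntax is not shown here), a generator lets game logic be written in its natural sequential order while each yield point exposes exactly the resumable, step-based interface an ML agent needs; Rulebook compiles this pattern to an explicit state machine with no runtime overhead. The function two_turn_game below is a hypothetical example, not code from the paper.

```python
# A minimal Python analogy (not actual Rulebook syntax) for what Rulebook
# automates: the game logic is written as straight-line coroutine code, and
# each `yield` marks a point where control returns to the ML agent. Rulebook
# generates the equivalent state machine at compile time, so no hand-written
# state management or inter-process communication is needed.

def two_turn_game():
    """Game written in its natural sequential order."""
    score = 0
    for turn in range(2):
        action = yield f"turn {turn}: pick 0 or 1"  # suspend, await the agent
        score += action
    yield f"game over, score {score}"


# The agent-facing driver sees only a resumable state machine.
game = two_turn_game()
prompt = next(game)   # advance to the first decision point
print(prompt)         # "turn 0: pick 0 or 1"
print(game.send(1))   # agent acts; game resumes until the next yield
print(game.send(0))   # "game over, score 1"
```

Note how the driver never inspects or stores intermediate game state: the suspended coroutine is the state, which is precisely the bookkeeping that hand-written state-machine environments must otherwise manage explicitly.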