Petr Kuderov

CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World

May 17, 2025

Zoya Volovikova, Gregory Gorbov, Petr Kuderov, Aleksandr I. Panov, Alexey Skrynnik

Abstract: Following instructions in real-world conditions requires the ability to adapt to the world's volatility and entanglement: the environment is dynamic and unpredictable, instructions can be linguistically complex with diverse vocabulary, and the number of possible goals an agent may encounter is vast. Despite extensive research in this area, most studies are conducted in static environments with simple instructions and a limited vocabulary, making it difficult to assess agent performance in more diverse and challenging settings. To address this gap, we introduce CrafText, a benchmark for evaluating instruction following in a multimodal environment with diverse instructions and dynamic interactions. CrafText includes 3,924 instructions with 3,423 unique words, covering Localization, Conditional, Building, and Achievement tasks. Additionally, we propose an evaluation protocol that measures an agent's ability to generalize to novel instruction formulations and dynamically evolving task configurations, providing a rigorous test of both linguistic understanding and adaptive decision-making.

Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments

Jul 12, 2024

Zoya Volovikova, Alexey Skrynnik, Petr Kuderov, Aleksandr I. Panov

Abstract: In this study, we address the issue of enabling an artificial intelligence agent to execute complex language instructions within virtual environments. In our framework, we assume that these instructions involve intricate linguistic structures and multiple interdependent tasks that must be navigated successfully to achieve the desired outcomes. To effectively manage these complexities, we propose a hierarchical framework that combines the deep language comprehension of large language models with the adaptive action-execution capabilities of reinforcement learning agents. The language module (based on LLM) translates the language instruction into a high-level action plan, which is then executed by a pre-trained reinforcement learning agent. We have demonstrated the effectiveness of our approach in two different environments: in IGLU, where agents are instructed to build structures, and in Crafter, where agents perform tasks and interact with objects in the surrounding environment according to language commands.

Learning Successor Representations with Distributed Hebbian Temporal Memory

Oct 20, 2023

Evgenii Dzhivelikian, Petr Kuderov, Aleksandr I. Panov

Abstract: This paper presents a novel approach to address the challenge of online hidden representation learning for decision-making under uncertainty in non-stationary, partially observable environments. The proposed algorithm, Distributed Hebbian Temporal Memory (DHTM), is based on factor graph formalism and a multicomponent neuron model. DHTM aims to capture sequential data relationships and make cumulative predictions about future observations, forming Successor Representation (SR). Inspired by neurophysiological models of the neocortex, the algorithm utilizes distributed representations, sparse transition matrices, and local Hebbian-like learning rules to overcome the instability and slow learning process of traditional temporal memory algorithms like RNN and HMM. Experimental results demonstrate that DHTM outperforms classical LSTM and performs comparably to more advanced RNN-like algorithms, speeding up Temporal Difference learning for SR in changing environments. Additionally, we compare the SRs produced by DHTM to another biologically inspired HMM-like algorithm, CSCG. Our findings suggest that DHTM is a promising approach for addressing the challenges of online hidden representation learning in dynamic environments.

* 12 pages, 4 figures
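
For readers unfamiliar with the Successor Representation mentioned in the abstract above, the following is a minimal sketch of the textbook tabular Temporal Difference update for SR. It is not the DHTM algorithm and is not taken from the paper. The five-state ring environment, the function names, and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
import numpy as np


def td_update_sr(M, s, s_next, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update to row s of a tabular SR matrix M (in place)."""
    one_hot = np.zeros(M.shape[0])
    one_hot[s] = 1.0
    # Target: state s is occupied now, plus discounted expected
    # future occupancies as currently estimated from s_next.
    target = one_hot + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M


def learn_sr_on_ring(n_states=5, n_steps=20000, alpha=0.1, gamma=0.9):
    """Learn the SR while walking a deterministic ring (hypothetical toy environment)."""
    M = np.zeros((n_states, n_states))
    s = 0
    for _ in range(n_steps):
        s_next = (s + 1) % n_states  # always step to the next state on the ring
        td_update_sr(M, s, s_next, alpha, gamma)
        s = s_next
    return M


def closed_form_sr(P, gamma=0.9):
    """Exact SR for a known transition matrix P: (I - gamma * P)^-1."""
    return np.linalg.inv(np.eye(P.shape[0]) - gamma * P)


if __name__ == "__main__":
    M = learn_sr_on_ring()
    P = np.roll(np.eye(5), 1, axis=1)  # P[s, (s + 1) % 5] = 1
    print(np.allclose(M, closed_form_sr(P), atol=1e-6))
```

In this deterministic toy setting the learned matrix can be checked against the closed form, and each row sums to roughly 1 / (1 - gamma). In non-stationary or partially observable settings, which are the ones the abstract targets, no such closed form is available and the speed of the TD estimate becomes the practical concern.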
