Nagoya Institute of Technology
Abstract:This paper introduces an open-source simulator, BeliefNest, designed to enable embodied agents to perform collaborative tasks by leveraging Theory of Mind. BeliefNest dynamically and hierarchically constructs simulators within a Minecraft environment, allowing agents to explicitly represent nested belief states about themselves and others. This enables agent control in open-domain tasks that require Theory of Mind reasoning. The simulator provides a prompt generation mechanism based on each belief state, facilitating the design and evaluation of methods for agent control utilizing large language models (LLMs). We demonstrate through experiments that agents can infer others' beliefs and predict their belief-based actions in false-belief tasks.
Abstract:This paper proposes methods for unsupervised lexical acquisition for relative spatial concepts using spoken user utterances. A robot with a flexible spoken dialog system must be able to acquire linguistic representation and its meaning specific to an environment through interactions with humans as children do. Specifically, relative spatial concepts (e.g., front and right) are widely used in our daily lives, however, it is not obvious which object is a reference object when a robot learns relative spatial concepts. Therefore, we propose methods by which a robot without prior knowledge of words can learn relative spatial concepts. The methods are formulated using a probabilistic model to estimate the proper reference objects and distributions representing concepts simultaneously. The experimental results show that relative spatial concepts and a phoneme sequence representing each concept can be learned under the condition that the robot does not know which located object is the reference object. Additionally, we show that two processes in the proposed method improve the estimation accuracy of the concepts: generating candidate word sequences by class n-gram and selecting word sequences using location information. Furthermore, we show that clues to reference objects improve accuracy even though the number of candidate reference objects increases.