Takato Horii

Osaka University

Social Reality Construction via Active Inference: Modeling the Dialectic of Conformity and Creativity

Apr 10, 2026

Kentaro Nomura, Takato Horii

Abstract:Social agents both internalize collective norms and reshape them through creative action, yet computational models have not captured this bidirectional process within a unified framework. We propose a multi-agent simulation model grounded in active inference that formalizes the dialectical constitution of social reality on a structured social network. Each agent maintains an internal generative model, communicates with neighbors to form social priors, creates novel observations, and selectively incorporates others' creations into memory. Simulation experiments demonstrate three main findings. First, informationally cohesive social groups emerge endogenously, with representational alignment mirroring the cluster topology of the underlying network. Second, a circular mutual constitution arises between social representations and the observation distribution, maintained through agents' creative acts that project representational structure onto the external world. Third, the propagation of creations exhibits selective, heterogeneous patterns distinct from the stable diffusion of social representations, indicating that agents construct cultural niches through local interaction dynamics. These results suggest that the interplay between social conformity and creative deviation can give rise to the endogenous formation and differentiation of shared social reality.

* Submitted to ALIFE 2026 conference

External Photoreflective Tactile Sensing Based on Surface Deformation Measurement

Nov 09, 2025

Seiichi Yamamoto, Hiroki Ishizuka, Takumi Kawasetsu, Koh Hosoda, Takayuki Kameoka, Kango Yanagida, Takato Horii, Sei Ikeda, Osamu Oshiro

Abstract:We present a tactile sensing method enabled by the mechanical compliance of soft robots; an externally attachable photoreflective module reads surface deformation of silicone skin to estimate contact force without embedding tactile transducers. Locating the sensor off the contact interface reduces damage risk, preserves softness, and simplifies fabrication and maintenance. We first characterize the optical sensing element and the compliant skin, then determine the design of a prototype tactile sensor. Compression experiments validate the approach, exhibiting a monotonic force-output relationship consistent with theory, low hysteresis, high repeatability over repeated cycles, and small response indentation speeds. We further demonstrate integration on a soft robotic gripper, where the module reliably detects grasp events. Compared with liquid-filled or wire-embedded tactile skins, the proposed modular add-on architecture enhances durability, reduces wiring complexity, and supports straightforward deployment across diverse robot geometries. Because the sensing principle reads skin strain patterns, it also suggests extensions to other somatosensory cues such as joint angle or actuator state estimation from surface deformation. Overall, leveraging surface compliance with an external optical module provides a practical and robust route to equip soft robots with force perception while preserving structural flexibility and manufacturability, paving the way for robotic applications and safe human-robot collaboration.

* This work has been submitted to the IEEE for possible publication

Correspondence of high-dimensional emotion structures elicited by video clips between humans and Multimodal LLMs

May 19, 2025

Haruka Asanuma, Naoko Koide-Majima, Ken Nakamura, Takato Horii, Shinji Nishimoto, Masafumi Oizumi

Abstract:Recent studies have revealed that human emotions exhibit a high-dimensional, complex structure. Fully capturing this complexity requires new approaches, as conventional models that disregard high dimensionality risk overlooking key nuances of human emotions. Here, we examined the extent to which the latest generation of rapidly evolving Multimodal Large Language Models (MLLMs) capture these high-dimensional, intricate emotion structures, including capabilities and limitations. Specifically, we compared self-reported emotion ratings from participants watching videos with model-generated estimates (e.g., Gemini or GPT). We evaluated performance not only at the individual video level but also from emotion structures that account for inter-video relationships. At the level of simple correlation between emotion structures, our results demonstrated strong similarity between human and model-inferred emotion structures. To further explore whether the similarity between humans and models is at the single-item level or the coarse-categorical level, we applied Gromov-Wasserstein Optimal Transport. We found that although performance was not necessarily high at the strict, single-item level, performance across video categories that elicit similar emotions was substantial, indicating that the model could infer human emotional experiences at the category level. Our results suggest that current state-of-the-art MLLMs broadly capture the complex high-dimensional emotion structures at the category level, while showing apparent limitations in accurately capturing entire structures at the single-item level.

* 25 pages, 7 figures
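
The "simple correlation between emotion structures" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an emotion structure is the set of pairwise Euclidean distances between per-video rating vectors and that similarity is a Pearson correlation over those distances; the abstract does not state the exact metric, and the function names here are hypothetical. It does not implement Gromov-Wasserstein Optimal Transport.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise_distances(ratings):
    """Flattened upper triangle of the video-by-video distance matrix.

    `ratings` is a list of per-video emotion rating vectors (assumed layout).
    """
    return [dist(a, b) for a, b in combinations(ratings, 2)]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def structure_similarity(human_ratings, model_ratings):
    """Correlate the inter-video distance structure of two rating sets."""
    return pearson(
        pairwise_distances(human_ratings),
        pairwise_distances(model_ratings),
    )


# Toy data: three videos rated on two emotion dimensions.
human = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
model = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]  # same geometry, different scale
similarity = structure_similarity(human, model)
```

Because the comparison is made on inter-video distances rather than on raw ratings, two raters who use different scales but agree on which videos feel alike still score as similar.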

Decentralized Collective World Model for Emergent Communication and Coordination

Apr 04, 2025

Kentaro Nomura, Tatsuya Aoki, Tadahiro Taniguchi, Takato Horii

Abstract:We propose a fully decentralized multi-agent world model that enables both symbol emergence for communication and coordinated behavior through temporal extension of collective predictive coding. Unlike previous research that focuses on either communication or coordination separately, our approach achieves both simultaneously. Our method integrates world models with communication channels, enabling agents to predict environmental dynamics, estimate states from partial observations, and share critical information through bidirectional message exchange with contrastive learning for message alignment. Using a two-agent trajectory drawing task, we demonstrate that our communication-based approach outperforms non-communicative models when agents have divergent perceptual capabilities, achieving the second-best coordination after centralized models. Importantly, our distributed approach with constraints preventing direct access to other agents' internal states facilitates the emergence of more meaningful symbol systems that accurately reflect environmental states. These findings demonstrate the effectiveness of decentralized communication for supporting coordination while developing shared representations of the environment.

System 0/1/2/3: Quad-process theory for multi-timescale embodied collective cognitive systems

Mar 08, 2025

Tadahiro Taniguchi, Yasushi Hirai, Masahiro Suzuki, Shingo Murata, Takato Horii, Kazutoshi Tanaka

Abstract:This paper introduces the System 0/1/2/3 framework as an extension of dual-process theory, employing a quad-process model of cognition. Expanding upon System 1 (fast, intuitive thinking) and System 2 (slow, deliberative thinking), we incorporate System 0, which represents pre-cognitive embodied processes, and System 3, which encompasses collective intelligence and symbol emergence. We contextualize this model within Bergson's philosophy by adopting multi-scale time theory to unify the diverse temporal dynamics of cognition. System 0 emphasizes morphological computation and passive dynamics, illustrating how physical embodiment enables adaptive behavior without explicit neural processing. Systems 1 and 2 are explained from a constructive perspective, incorporating neurodynamical and AI viewpoints. In System 3, we introduce collective predictive coding to explain how societal-level adaptation and symbol emergence operate over extended timescales. This comprehensive framework ranges from rapid embodied reactions to slow-evolving collective intelligence, offering a unified perspective on cognition across multiple timescales, levels of abstraction, and forms of human intelligence. The System 0/1/2/3 model provides a novel theoretical foundation for understanding the interplay between adaptive and cognitive processes, thereby opening new avenues for research in cognitive science, AI, robotics, and collective intelligence.

* Under review

Creative Agents: Simulating the Systems Model of Creativity with Generative Agents

Nov 26, 2024

Naomi Imasato, Kazuki Miyazawa, Takayuki Nagai, Takato Horii

Abstract:With the growing popularity of generative AI for images, video, and music, we have witnessed models rapidly improve in quality and performance. However, little attention has been paid to enabling AI to "be creative". In this study, we implemented and simulated the systems model of creativity (proposed by Csikszentmihalyi) using virtual agents utilizing large language models (LLMs) and text prompts. For comparison, the simulations were conducted with the "virtual artists" being: 1) isolated and 2) placed in a multi-agent system. Both scenarios were compared by analyzing the variations and overall "creativity" in the generated artifacts (measured via a user study and LLM). Our results suggest that the generative agents may perform better in the framework of the systems model of creativity.

Predictive Reachability for Embodiment Selection in Mobile Manipulation Behaviors

Oct 28, 2024

Xiaoxu Feng, Takato Horii, Takayuki Nagai

Abstract:Mobile manipulators require coordinated control between navigation and manipulation to accomplish tasks. Typically, coordinated mobile manipulation behaviors have base navigation to approach the goal followed by arm manipulation to reach the desired pose. Selecting the embodiment between the base and arm can be determined based on reachability. Previous methods evaluate reachability by computing inverse kinematics and activate arm motions once solutions are identified. In this study, we introduce a new approach called predictive reachability that decides reachability based on predicted arm motions. Our model utilizes a hierarchical policy framework built upon a world model. The world model allows the prediction of future trajectories and the evaluation of reachability. The hierarchical policy selects the embodiment based on the predicted reachability and plans accordingly. Unlike methods that require prior knowledge about robots and environments for inverse kinematics, our method only relies on image-based observations. We evaluate our approach through basic reaching tasks across various environments. The results demonstrate that our method outperforms previous model-based approaches in both sample efficiency and performance, while enabling more reasonable embodiment selection based on predictive reachability.

LiP-LLM: Integrating Linear Programming and dependency graph with Large Language Models for multi-robot task planning

Oct 28, 2024

Kazuma Obata, Tatsuya Aoki, Takato Horii, Tadahiro Taniguchi, Takayuki Nagai

Abstract:This study proposes LiP-LLM: integrating linear programming and dependency graph with large language models (LLMs) for multi-robot task planning. In order for multiple robots to perform tasks more efficiently, it is necessary to manage the precedence dependencies between tasks. Although multi-robot decentralized and centralized task planners using LLMs have been proposed, none of these studies focus on precedence dependencies from the perspective of task efficiency or leverage traditional optimization methods. It addresses key challenges in managing dependencies between skills and optimizing task allocation. LiP-LLM consists of three steps: skill list generation and dependency graph generation by LLMs, and task allocation using linear programming. The LLMs are utilized to generate a comprehensive list of skills and to construct a dependency graph that maps the relationships and sequential constraints among these skills. To ensure the feasibility and efficiency of skill execution, the skill list is generated by calculated likelihood, and linear programming is used to optimally allocate tasks to each robot. Experimental evaluations in simulated environments demonstrate that this method outperforms existing task planners, achieving higher success rates and efficiency in executing complex, multi-robot tasks. The results indicate the potential of combining LLMs with optimization techniques to enhance the capabilities of multi-robot systems in executing coordinated tasks accurately and efficiently. In an environment with two robots, a maximum success rate difference of 0.82 is observed in the language instruction group with a change in the object name.
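
The role of precedence dependencies described in the abstract above can be illustrated with a small sketch. It covers only the dependency-graph side: given a hand-written graph of skills (in the paper the graph is generated by LLMs), it groups skills into waves whose members have no unmet prerequisites and could therefore be given to different robots at the same time. The skill names and the `parallel_waves` helper are hypothetical, and the linear-programming allocation step is not reproduced here.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies):
    """Group skills into waves that respect precedence constraints.

    `dependencies` maps each skill to the set of skills that must finish
    before it can start. Raises graphlib.CycleError on circular dependencies.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every skill returned here has all of its prerequisites completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical skill graph for a table-clearing instruction.
skill_dependencies = {
    "pick_cup": set(),
    "pick_plate": set(),
    "place_cup": {"pick_cup"},
    "place_plate": {"pick_plate"},
    "wipe_table": {"place_cup", "place_plate"},
}
waves = parallel_waves(skill_dependencies)
```

Each wave is a candidate set for parallel execution; an allocator, such as the linear program the abstract refers to, would then decide which robot takes which skill within those constraints.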

Constructive Approach to Bidirectional Causation between Qualia Structure and Language Emergence

Sep 14, 2024

Tadahiro Taniguchi, Masafumi Oizumi, Noburo Saji, Takato Horii, Naotsugu Tsuchiya

Abstract:This paper presents a novel perspective on the bidirectional causation between language emergence and relational structure of subjective experiences, termed qualia structure, and lays out the constructive approach to the intricate dependency between the two. We hypothesize that languages with distributional semantics, e.g., syntactic-semantic structures, may have emerged through the process of aligning internal representations among individuals, and such alignment of internal representations facilitates more structured language. This mutual dependency is suggested by the recent advancements in AI and symbol emergence robotics, and collective predictive coding (CPC) hypothesis, in particular. Computational studies show that neural network-based language models form systematically structured internal representations, and multimodal language models can share representations between language and perceptual information. This perspective suggests that language emergence serves not only as a mechanism creating a communication tool but also as a mechanism for allowing people to realize shared understanding of qualitative experiences. The paper discusses the implications of this bidirectional causation in the context of consciousness studies, linguistics, and cognitive science, and outlines future constructive research directions to further explore this dynamic relationship between language emergence and qualia structure.

* 20 pages, 4 Figures

Goal Estimation-based Adaptive Shared Control for Brain-Machine Interfaces Remote Robot Navigation

Jul 25, 2024

Tomoka Muraoka, Tatsuya Aoki, Masayuki Hirata, Tadahiro Taniguchi, Takato Horii, Takayuki Nagai

Abstract:In this study, we propose a shared control method for teleoperated mobile robots using brain-machine interfaces (BMI). The control commands generated through BMI for robot operation face issues of low input frequency, discreteness, and uncertainty due to noise. To address these challenges, our method estimates the user's intended goal from their commands and uses this goal to generate auxiliary commands through the autonomous system that are both at a higher input frequency and more continuous. Furthermore, by defining the confidence level of the estimation, we adaptively calculated the weights for combining user and autonomous commands, thus achieving shared control.
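
The confidence-weighted combination of user and autonomous commands described in the abstract above can be sketched as follows. The abstract does not give the weighting formula, so this assumes a simple linear blend in which the goal-estimation confidence, clamped to [0, 1], is the weight on the autonomous command. The `Command` type and `blend` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Command:
    linear: float   # forward velocity
    angular: float  # turn rate


def blend(user: Command, auto: Command, confidence: float) -> Command:
    """Linearly mix user and autonomous commands (assumed blending rule).

    Low confidence in the estimated goal leaves control with the user;
    high confidence shifts control toward the autonomous system.
    """
    weight = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return Command(
        linear=(1.0 - weight) * user.linear + weight * auto.linear,
        angular=(1.0 - weight) * user.angular + weight * auto.angular,
    )


# The user pushes straight ahead while the autonomous system wants to turn.
user_cmd = Command(linear=1.0, angular=0.0)
auto_cmd = Command(linear=0.0, angular=1.0)
mixed = blend(user_cmd, auto_cmd, confidence=0.25)
```

With a confidence of 0.25 the result stays close to the user's command; as the estimate of the intended goal firms up, the same call moves the output toward the autonomous command.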
